!python --version
!conda info
active environment : base
active env location : /home/cytech/miniconda3
shell level : 1
user config file : /home/cytech/.condarc
populated config files :
conda version : 23.1.0
conda-build version : not installed
python version : 3.10.9.final.0
virtual packages : __archspec=1=x86_64
__glibc=2.31=0
__linux=5.4.0=0
__unix=0=0
base environment : /home/cytech/miniconda3 (writable)
conda av data dir : /home/cytech/miniconda3/etc/conda
conda av metadata url : None
channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /home/cytech/miniconda3/pkgs
/home/cytech/.conda/pkgs
envs directories : /home/cytech/miniconda3/envs
/home/cytech/.conda/envs
platform : linux-64
user-agent : conda/23.1.0 requests/2.28.1 CPython/3.10.9 Linux/5.4.0-139-generic ubuntu/20.04.5 glibc/2.31
UID:GID : 1000:1001
netrc file : None
offline mode : False
!conda list
# packages in environment at /home/cytech/miniconda3:
#
# Name                    Version
matplotlib                3.7.0
numpy                     1.24.2
opencv                    4.6.0
pillow                    9.4.0
python                    3.10.9
scikit-image              0.19.3
scikit-learn              1.2.1
scipy                     1.10.1
tensorflow                2.11.0
... (full listing of the remaining packages truncated)
intro_images_path='./Images/Intro_images/'
edge_images_path='./Images/Edge_images/'
seg_images_path='./Images/Seg_images/'
feature_images_path='./Images/Feature_images/'
output_images_path='./Images/Output_images/'
print('Image paths ....')
Image paths ....
Repeat the first exercises on this page:
https://docs.opencv.org/4.3.0/dc/d2e/tutorial_py_image_display.html
https://docs.opencv.org/3.4.2/dc/d2e/tutorial_py_image_display.html
Use the function cv2.imread() to read an image. The image should be in the working directory, or a full path to the image should be given. The second argument is a flag which specifies the way the image should be read:
cv2.IMREAD_COLOR : Loads a color image. Any transparency of image will be neglected. It is the default flag.
cv2.IMREAD_GRAYSCALE : Loads image in grayscale mode.
cv2.IMREAD_UNCHANGED : Loads image as such including alpha channel.
Instead of these three flags, you can simply pass integers 1, 0 or -1 respectively.
import numpy as np
from matplotlib import pyplot as plt
import cv2
import sys
# Load a color image in grayscale
img = cv2.imread(intro_images_path+'pazo.jpg', 0)
if img is None:
    sys.exit('Failed to load image')
print(cv2.__version__)
4.6.0
Use the function cv2.imshow() to display an image in a window. The window automatically fits to the image size.
First argument is a window name which is a string. Second argument is our image. You can create as many windows as you wish, but with different window names.
cv2.waitKey() is a keyboard binding function. Its argument is the time in milliseconds. The function waits for specified milliseconds for any keyboard event. If you press any key in that time, the program continues. If 0 is passed, it waits indefinitely for a key stroke. It can also be set to detect specific key strokes.
If you are using a 64-bit machine, you will have to modify k = cv2.waitKey(0) line as follows : k = cv2.waitKey(0) & 0xFF
cv2.destroyAllWindows() simply destroys all the windows we created. If you want to destroy any specific window, use the function cv2.destroyWindow() where you pass the exact window name as the argument.
import numpy as np
from matplotlib import pyplot as plt
import cv2
img = cv2.imread(intro_images_path+'pazo.jpg')
plt.imshow(img) # note: img is BGR, so matplotlib displays red and blue swapped
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
#cv2.waitKey(0)
#cv2.destroyAllWindows()
Use the function cv2.imwrite() to save an image. The first argument is the file name, the second argument is the image you want to save.
The program below displays the image in a window and saves it to the output directory if you press 's', or exits without saving if you press the ESC key:
import numpy as np
import cv2
# Load a color image
img = cv2.imread(intro_images_path+'pazo.jpg')
cv2.imshow('image',img)
k = cv2.waitKey(0) & 0xFF
if k == 27: # wait for ESC key to exit
    cv2.destroyAllWindows()
elif k == ord('s'): # wait for 's' key to save and exit
    cv2.imwrite(output_images_path+'pazo_gray.jpg', img)
    cv2.destroyAllWindows()
np.set_printoptions(edgeitems=25)
print("Data type = {}\n".format(img.dtype))
print("Object type = {}\n".format(type(img)))
print('Image size:{}\n'.format(img.shape))
print('Channel G of the {} upper left window:\n'.format('20x20'))
print(img[0:20,0:20,1])
print('\nOrigin (0,0), first index row, second index column:')
print('Element (1,0): {}'.format(img[1,0,1]))
print('Element (0,1): {}\n'.format(img[0,1,1]))
Data type = uint8

Object type = <class 'numpy.ndarray'>

Image size:(600, 886, 3)

Channel G of the 20x20 upper left window:

[[51 52 52 53 52 52 52 52 51 52 53 53 51 51 52 53 52 52 52 52]
 [51 52 52 53 52 52 52 52 51 51 52 52 51 51 51 52 53 53 53 53]
 [52 52 52 53 52 52 52 52 51 52 52 52 51 51 52 52 53 53 53 53]
 [52 52 52 52 52 52 52 52 52 53 53 53 53 52 53 53 53 53 53 53]
 [51 51 52 52 52 52 52 52 52 53 54 54 53 53 54 55 53 53 53 53]
 [51 51 52 52 52 52 52 52 52 52 53 53 53 53 53 54 54 54 54 55]
 [51 51 52 52 52 53 52 52 52 53 53 53 53 53 54 54 55 55 55 55]
 [51 51 52 52 53 53 53 52 53 53 54 54 54 54 54 55 54 54 54 54]
 [51 53 54 53 52 51 52 54 53 53 53 52 52 53 54 54 54 55 55 54]
 [51 52 53 53 52 52 53 54 53 53 53 53 53 54 54 54 55 55 55 55]
 [51 52 53 53 53 53 54 55 53 54 54 54 54 54 54 54 56 56 56 56]
 [51 52 53 53 53 53 54 55 53 53 54 55 55 55 54 54 56 56 56 56]
 [53 53 54 54 53 53 54 54 53 53 54 55 55 55 55 55 56 56 56 56]
 [54 54 54 54 54 54 54 54 53 54 54 54 55 55 55 55 56 56 56 56]
 [53 53 53 54 54 54 54 54 54 54 54 54 55 56 56 56 56 56 56 56]
 [53 53 53 54 54 55 55 55 55 55 55 54 55 56 57 57 56 56 57 57]
 [55 54 55 56 56 54 53 54 55 55 56 56 55 56 56 56 55 56 56 56]
 [55 54 54 56 56 54 53 54 55 54 54 54 55 56 55 55 55 56 56 56]
 [54 54 54 55 55 54 54 54 56 55 53 54 56 57 56 54 55 55 56 56]
 [54 54 55 56 56 55 55 54 57 56 55 56 57 57 56 56 55 55 56 56]]

Origin (0,0), first index row, second index column:
Element (1,0): 51
Element (0,1): 52
Matplotlib is a plotting library for Python which gives you a wide variety of plotting methods. Here you will learn how to display an image with Matplotlib. You can zoom images, save them, etc. There is a difference in pixel ordering between OpenCV and Matplotlib: OpenCV follows BGR order, while Matplotlib follows RGB order. So when you display an image loaded in OpenCV using Matplotlib functions, you need to convert it to RGB first. The program below loads a color image, converts it to grayscale, displays it with Matplotlib, and saves the result with cv2.imwrite().
import numpy as np
from matplotlib import pyplot as plt
import cv2
# Load a color image
img = cv2.imread(intro_images_path+'pazo.jpg', 1)
b,g,r = cv2.split(img)   # get b, g, r
img = cv2.merge([r,g,b]) # switch it to RGB
#img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # alternatively...
img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY) # the image is RGB at this point
print("Data type = {}\n".format(img.dtype))
print("Object type = {}\n".format(type(img)))
print("Image size = {}\n".format(img.shape))
print(' {} upper left window in gray-level:\n'.format('20x20'))
print(img[0:20,0:20])
print('\nOrigin (0,0), first index row, second index column:')
print('Element (1,0): {}'.format(img[1,0]))
print('Element (0,1): {}\n'.format(img[0,1]))
plt.imshow(img,cmap='gray')
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
retval=cv2.imwrite(output_images_path+'img_gray.png',img)
if retval:
    print('saved image!')
else:
    print('Error saving image!')
Data type = uint8

Object type = <class 'numpy.ndarray'>

Image size = (600, 886)

20x20 upper left window in gray-level:

[[76 77 77 78 78 79 79 79 78 79 80 80 78 78 79 80 79 79 79 79]
 [76 77 77 78 78 79 79 79 78 78 79 79 78 78 78 79 80 80 80 80]
 [77 77 77 78 78 79 79 79 78 79 79 79 78 78 79 79 80 80 80 80]
 [77 77 77 78 78 79 79 79 79 80 80 80 80 79 80 80 80 80 80 80]
 [77 77 78 78 78 79 79 79 79 80 81 81 80 80 81 82 80 80 80 80]
 [77 77 78 78 78 79 79 79 79 79 80 80 80 80 80 81 81 81 81 82]
 [77 77 78 78 78 80 79 79 79 80 80 80 80 80 81 81 82 82 82 82]
 [77 77 78 78 79 80 80 79 80 80 81 81 81 81 81 82 81 81 81 81]
 [77 79 80 79 78 78 79 81 80 80 80 79 79 80 81 81 80 81 81 81]
 [77 78 79 79 78 79 80 81 80 80 80 80 80 81 81 81 81 81 81 81]
 [77 78 79 79 79 80 81 82 80 81 81 81 81 81 81 81 82 82 82 82]
 [77 78 79 79 79 80 81 82 80 80 81 82 82 82 81 81 82 82 82 82]
 [79 79 80 80 79 80 81 81 80 80 81 82 82 81 81 81 82 82 82 82]
 [80 80 80 80 80 81 81 81 80 81 81 81 81 81 81 81 82 82 82 82]
 [79 79 79 80 80 81 81 81 81 81 81 81 81 82 82 82 82 82 82 82]
 [79 79 79 80 80 82 82 82 82 82 82 81 81 82 83 83 82 82 83 83]
 [82 81 82 83 83 81 80 81 82 82 83 83 82 83 83 83 82 83 83 83]
 [82 81 81 83 83 81 80 81 82 81 81 81 82 83 82 82 82 83 83 83]
 [81 81 81 82 82 81 81 81 83 82 80 81 83 84 83 81 82 82 83 83]
 [81 81 82 83 83 82 82 81 84 83 82 83 84 84 83 83 82 82 83 83]]

Origin (0,0), first index row, second index column:
Element (1,0): 76
Element (0,1): 77
Error saving image!
https://docs.opencv.org/4.3.0/df/d9d/tutorial_py_colorspaces.html
https://docs.opencv.org/3.4.2/df/d9d/tutorial_py_colorspaces.html
There are more than 150 color-space conversion methods available in OpenCV. But we will look into only the two which are most widely used: BGR ↔ Gray and BGR ↔ HSV.
For color conversion, we use the function cv2.cvtColor(input_image, flag) where flag determines the type of conversion.
For BGR Gray conversion we use the flags cv2.COLOR_BGR2GRAY. Similarly for BGR HSV, we use the flag cv2.COLOR_BGR2HSV.
To get other flags, just run the following commands:
import cv2
flags = [i for i in dir(cv2) if i.startswith('COLOR_')]
print(flags)
['COLOR_BAYER_BG2BGR', 'COLOR_BAYER_BG2BGRA', 'COLOR_BAYER_BG2BGR_EA', 'COLOR_BAYER_BG2BGR_VNG', 'COLOR_BAYER_BG2GRAY', 'COLOR_BAYER_BG2RGB', ..., 'COLOR_BGR2GRAY', 'COLOR_BGR2HLS', 'COLOR_BGR2HSV', 'COLOR_BGR2LAB', 'COLOR_BGR2LUV', 'COLOR_BGR2RGB', 'COLOR_BGR2XYZ', 'COLOR_BGR2YCrCb', 'COLOR_BGR2YUV', ..., 'COLOR_YUV420sp2RGB', 'COLOR_YUV420sp2RGBA', 'COLOR_mRGBA2RGBA'] (output truncated: over 300 conversion flags)
Functions to write text and draw different geometric shapes with OpenCV: cv2.line(), cv2.circle(), cv2.rectangle(), cv2.ellipse(), cv2.putText(), etc.
In all the above functions, you will see some common arguments as given below:
img: The image where you want to draw the shapes.
color: Color of the shape. For BGR, pass it as a tuple, e.g. (255,0,0) for blue. For grayscale, just pass the scalar value.
thickness: Thickness of the line or circle etc. If -1 is passed for closed figures like circles, it will fill the shape. Default thickness = 1.
lineType: Type of line, whether 8-connected, anti-aliased, etc. By default, it is 8-connected. cv2.LINE_AA gives an anti-aliased line which looks great for curves.
Learn more in https://docs.opencv.org/4.x/dc/da5/tutorial_py_drawing_functions.html
import numpy as np
from matplotlib import pyplot as plt
import cv2
image = cv2.imread(intro_images_path+'overexposed.jpg')
# Draw two circles (one filled) and one rectangle
cv2.circle(image, center=(120, 110), radius=90, color=(0, 0, 255), thickness=2)
cv2.circle(image, (350, 160), 10, (255, 0, 0), -1)
cv2.rectangle(image, (370, 100), (465, 300), (0, 0, 255), 2)
# write some text on the image
cv2.putText(image, "Text!", org=(30,30), fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=1, color=(0, 255, 0), thickness=2)
# display (the image is BGR, so matplotlib swaps red and blue here)
plt.imshow(image)
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
Let's see the effect of using different spatial sampling rates and intensity quantization levels.
import cv2
import time
import numpy as np
import matplotlib.pyplot as plt
#Load an image
I = cv2.imread(intro_images_path +"bridge.jpg", 1)
if I is None:
    raise Exception("File not found!")
I = cv2.cvtColor(I, cv2.COLOR_BGR2RGB)
#subsampling of the original image and visualization at the relative scale
Int_sampling = 2
I2 = I[::Int_sampling,::Int_sampling,:]
I3 = I2[::Int_sampling,::Int_sampling,:]
plt.figure(figsize=[10,10]);plt.imshow(I);plt.title("Original");plt.show()
plt.figure(figsize=[5,5]);plt.imshow(I2);plt.title("Subsampled once"); plt.show()
plt.figure(figsize=[2.5,2.5]);plt.imshow(I3);plt.title("Subsampled twice");plt.show()
print('Sizes I1 = {}, I2 = {} and I3 = {}'.format(I.shape, I2.shape, I3.shape))
#Visualization at the same scale
plt.figure(figsize=[10,10]);plt.imshow(I);plt.title("Original");plt.show()
plt.figure(figsize=[10,10]);plt.imshow(I2);plt.title("Subsampled once"); plt.show()
plt.figure(figsize=[10,10]);plt.imshow(I3);plt.title("Subsampled twice");plt.show()
Sizes I1 = (720, 960, 3), I2 = (360, 480, 3) and I3 = (180, 240, 3)
import cv2
import time
import numpy as np
import matplotlib.pyplot as plt
#Load an image as gray-level
I0 = cv2.imread(intro_images_path +"eiffel.jpg", 0)
if I0 is None:
    raise Exception("File not found!")
I1 = cv2.resize(I0, fx = 0.5, fy = 0.5, dsize = None) # Resize image
height = I1.shape[0] # shape[0]: first image dimension (height)
width = I1.shape[1]  # shape[1]: second image dimension (width)
print('I1 shape: {}x{}\n'.format(width, height))
plt.rcParams['image.cmap'] = 'gray'
plt.figure(figsize=[10,10]);plt.imshow(I1);plt.title("Original");plt.show()
# Create matrices of the same size as the original image
#I2 = np.zeros((height, width), np.uint8)
#I3 = np.zeros((height, width), np.uint8)
#I4 = np.zeros((height, width), np.uint8)
#I5 = np.zeros((height, width), np.uint8)
#I6 = np.zeros((height, width), np.uint8)
# Quantize the values of the original image matrix
I2 = np.uint8(I1/4) * 4      # 64 gray levels
#I3 = np.uint8(I1/8) * 8     # 32 gray levels
I4 = np.uint8(I1/16) * 16    # 16 gray levels
#I5 = np.uint8(I1/32) * 32   # 8 gray levels
I6 = np.uint8(I1/64) * 64    # 4 gray levels
I7 = np.uint8(I1/128) * 128  # 2 gray levels
# Display the resulting images
title = ['64 levels', '16 levels', '4 levels', '2 levels'] # subplot titles
img = [I2, I4, I6, I7]
plt.figure(figsize=[15,15])
for i in range(4):
    plt.subplot(2, 2, i + 1) # subplot indices start at 1, hence i+1
    plt.imshow(img[i], 'gray')
    plt.title(title[i])
    plt.xticks([]), plt.yticks([])
plt.show()
I1 shape: 384x512
https://docs.opencv.org/4.x/d1/db7/tutorial_py_histogram_begins.html
OpenCV offers the cv2.calcHist() function to find the histogram. Let's familiarize ourselves with the function and its parameters:
cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])
images : source image of type uint8 or float32. It should be given in square brackets, i.e. "[img]".
channels : also given in square brackets. It is the index of the channel for which the histogram is computed. If the source is grayscale, its value is [0]. For color images, you can pass [0], [1] or [2] to select the blue, green or red channel respectively.
mask : mask image. To find the histogram of the full image, it is given as "None". But if you want to find the histogram of a particular region of the image, you have to create a mask image for that region and pass it as the mask.
histSize : bin count, also given in square brackets. For full scale, we pass [256].
ranges : normally [0,256].
So let us start with a sample image. Simply load the image in grayscale mode and find its full histogram.
import numpy as np
import cv2
from matplotlib import pyplot as plt
#using matplotlib function
img = cv2.imread(intro_images_path+'pazo.jpg',0)
plt.title("matplotlib")
plt.hist(img.ravel(),256,[0,256]); plt.show()
#using opencv function
hist_item = cv2.calcHist([img],[0],None,[256],[0,256])
hist_norm=hist_item/img.size
plt.plot(hist_norm)
plt.ylabel('Frequency')
plt.xlim([0,256])
plt.title("opencv")
plt.show()
hist is a 256x1 array, each value corresponds to the number of pixels in that image with its corresponding pixel value.
A mask (white rectangle) can be created this way:
# Create the basic black image
mask = np.zeros(img.shape, dtype = "uint8")
# Draw a white, filled rectangle on the mask image
x = int(0.25*img.shape[0]) # top row
y = int(0.25*img.shape[1]) # left column
h = int(0.35*img.shape[0]) # height (rows)
w = int(0.15*img.shape[1]) # width (columns)
#-1 for a filled rectangle
#cv2.rectangle(mask, (x,y), (x+h, y+w), (255, 255, 255), -1)
mask[x:x+h,y:y+w]=255
# Display constructed mask
#cv2.namedWindow("Mask", cv2.WINDOW_NORMAL)
#cv2.imshow("Mask", mask)
#cv2.waitKey(0)
plt.imshow(mask, cmap = 'gray')#, interpolation = 'bicubic')
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
Find the histogram for some specific image regions using masks.
import numpy as np
import cv2
import matplotlib.pyplot as plt
img = cv2.imread(intro_images_path+'pazo.jpg',0)
# create a mask
mask = np.zeros(img.shape[:2], np.uint8)
mask[100:300, 100:400] = 255
masked_img =cv2.bitwise_and(img,img,mask = mask)
# Calculate histogram with mask and without mask
# Check third argument for mask
hist_mask = cv2.calcHist([img],[0],mask,[256],[0,256])
hist_full = cv2.calcHist([img],[0],None,[256],[0,256])
plt.subplot(221), plt.imshow(img,'gray'), plt.axis('off')
plt.subplot(222), plt.imshow(mask,'gray'),plt.axis('off')
plt.subplot(223), plt.imshow(masked_img,'gray'),plt.axis('off')
hist_full_norm=100*hist_full/img.size
hist_mask_norm=100*hist_mask/(200*300) # mask area: 200 rows x 300 columns
#plt.ylabel('Frequency')
plt.subplot(224), plt.plot(hist_full_norm), plt.plot(hist_mask_norm)
plt.ylabel('Frequency')
plt.xlim([0,256])
plt.show()
https://docs.opencv.org/4.3.0/da/d6e/tutorial_py_geometric_transformations.html
There are a number of basic operations we can perform on an image. OpenCV provides two transformation functions, cv2.warpAffine() and cv2.warpPerspective(), with which you can perform all kinds of transformations. cv2.warpAffine() uses a 2x3 transformation matrix M, while cv2.warpPerspective() takes a 3x3 one.
dst = cv2.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]])
dst = cv2.warpPerspective(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]])
Translation is the shifting of an object's location. If you know the shift in the (x,y) direction and let it be (tx,ty), you can create the transformation matrix M as follows:
M=[[1, 0, tx],
[0, 1, ty]]
import numpy as np
from matplotlib import pyplot as plt
import cv2
import sys
# Load a color image
img = cv2.imread(intro_images_path +'pazo.jpg')
if img is None:
    sys.exit('Failed to load image')
img=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
num_rows, num_cols = img.shape[:2]
translation_matrix = np.float32([ [1,0,70], [0,1,110] ])
#translation with image cropping
img_tc = cv2.warpAffine(img, translation_matrix, (num_cols, num_rows), cv2.INTER_LINEAR)
plt.imshow(img_tc)
plt.xticks([]), plt.yticks([])
plt.show()
#translation without image cropping
img_t = cv2.warpAffine(img, translation_matrix, (num_cols+ 2*int(translation_matrix[0,2]), num_rows+ 2*int(translation_matrix[1,2])), cv2.INTER_LINEAR)
plt.imshow(img_t)
plt.xticks([]), plt.yticks([])
plt.show()
Rotation of an image for an angle θ is achieved by the transformation matrix of the form
M=[[cosθ, sinθ],
[−sinθ, cosθ]]
OpenCV also provides scaled rotation with an adjustable center of rotation so that you can rotate around any location you prefer. The modified transformation matrix is given by
M=[[α, β, (1−α)⋅center.x − β⋅center.y],
   [−β, α, β⋅center.x + (1−α)⋅center.y]]
where:
α=scale⋅cosθ,
β=scale⋅sinθ
To find this transformation matrix, you can write the affine transformation yourself and call cv2.warpAffine(image, M, output_shape). Alternatively, OpenCV provides the function cv2.getRotationMatrix2D(center, angle, scale).
Alternative 1
import numpy as np
from matplotlib import pyplot as plt
import cv2
def get_affine_cv(t, r, s):
    # 2x3 scaled-rotation matrix around center t, with angle r (radians) and scale s
    sin_theta = np.sin(r)
    cos_theta = np.cos(r)
    a_11 = s * cos_theta
    a_21 = -s * sin_theta
    a_12 = s * sin_theta
    a_22 = s * cos_theta
    a_13 = t[0] * (1 - s * cos_theta) - s * sin_theta * t[1]
    a_23 = t[1] * (1 - s * cos_theta) + s * sin_theta * t[0]
    return np.array([[a_11, a_12, a_13],
                     [a_21, a_22, a_23]])
#Load an image
img = cv2.imread(intro_images_path+'pazo.jpg')
img=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
rows,cols = img.shape[:2]
center=((cols-1)/2, (rows-1)/2)
angle=30
scale=0.6
M = get_affine_cv(center, np.deg2rad(angle), scale)
img_r = cv2.warpAffine(img, M, (cols, rows))
plt.imshow(img_r)
plt.xticks([]), plt.yticks([])
plt.show()
Alternative 2
import numpy as np
from matplotlib import pyplot as plt
import cv2
img = cv2.imread(intro_images_path+'pazo.jpg')
img=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
rows,cols = img.shape[:2]
center=((cols-1)/2, (rows-1)/2)
angle=30
scale=0.6
M=cv2.getRotationMatrix2D(center, angle, scale)
img_r = cv2.warpAffine(img, M, (cols,rows))
plt.imshow(img_r)
plt.xticks([]), plt.yticks([])
plt.show()
Scaling is a commonly used method in computer vision and image processing, where we resize an image according to the requirement. Roughly speaking, image scaling does one of two things: it either enlarges the image or shrinks it. In OpenCV we have the function cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]), with which we can perform the scaling of the image.
Here are the interpolation methods provided by OpenCV (a comparison sketch follows the example below):
cv2.INTER_AREA uses pixel area relation for resampling. This is the best method for reducing the size of an image. When used for zooming into the image, it behaves like the INTER_NEAREST method.
cv2.INTER_CUBIC uses bicubic interpolation for resizing. This method performs calculations based on the 4×4 neighboring pixels and takes the weighted average of these 16 pixels to create a new pixel.
cv2.INTER_LINEAR is similar to INTER_CUBIC, but it uses only the 2×2 neighboring pixels to get the weighted average.
cv2.INTER_NEAREST uses the nearest-neighbor concept for interpolation, taking only one neighboring pixel from the image.
import numpy as np
from matplotlib import pyplot as plt
import cv2
import matplotlib
#matplotlib.rcParams['figure.figsize'] = (6.0, 6.0)
# Load a color image
img = cv2.imread(intro_images_path+'pazo.jpg')
img=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_shrinked = cv2.resize(img,None,fx=0.5, fy=0.5, interpolation = cv2.INTER_AREA)
plt.figure(figsize= (0.01*img_shrinked.shape[0],0.01*img_shrinked.shape[1]));plt.imshow(img_shrinked)
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
img_enlarged = cv2.resize(img,None,fx=1.5, fy=1.5, interpolation = cv2.INTER_CUBIC)
plt.figure(figsize= (0.01*img_enlarged.shape[0],0.01*img_enlarged.shape[1]));plt.imshow(img_enlarged)
plt.xticks([]), plt.yticks([])
plt.show()
https://docs.opencv.org/4.3.0/da/d6e/tutorial_py_geometric_transformations.html
https://docs.opencv.org/3.4.2/da/d6e/tutorial_py_geometric_transformations.html
In affine transformations all parallel lines in the source image will still be parallel in the output image. To find the transformation matrix, we need three points from the input image and their corresponding locations in the output image. Then cv2.getAffineTransform() will create a 2x3 matrix which is to be passed to cv2.warpAffine().
import numpy as np
from matplotlib import pyplot as plt
import cv2
import matplotlib
#matplotlib.rcParams['figure.figsize'] = (6.0, 6.0)
# Load a color image
img = cv2.imread(intro_images_path+'pazo.jpg')
img=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
image=img.copy()
# Handpicked correspondences
pts1 = np.float32([[50,60],[600,60],[300,300]])
pts2 = np.float32([[40,50],[580,90],[260,350]])
for pt in pts1:
    cv2.circle(image, (int(pt[0]), int(pt[1])), 10, (255, 0, 0), -1)
for pt in pts2:
    cv2.circle(image, (int(pt[0]), int(pt[1])), 10, (0, 255, 0), -1)
rows, cols = img.shape[:2]
M = cv2.getAffineTransform(pts1, pts2)
dst = cv2.warpAffine(img, M, (cols, rows))
plt.figure(figsize= (0.01*image.shape[0],0.01*image.shape[1]));plt.imshow(image)
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
plt.figure(figsize= (0.01*dst.shape[0],0.01*dst.shape[1]));plt.imshow(dst)
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
Perspective transformations are defined by a 3x3 transformation matrix. Straight lines will remain straight even after the transformation. To find this transformation matrix, you need 4 points on the first view and the corresponding points on the second view; among these 4 points, no 3 of them should be collinear. The transformation matrix can then be found by the function cv2.getPerspectiveTransform(). Then apply cv2.warpPerspective() with this 3x3 transformation matrix.
The function cv2.warpPerspective() applies a perspective transformation to an image:
dst=cv2.warpPerspective(src, h, dsize)
src: input image.
dst: output image that has the size dsize and the same type as src.
h: homography (3×3 transformation matrix).
dsize: size of the output image.
import numpy as np
from matplotlib import pyplot as plt
import cv2
import matplotlib
#matplotlib.rcParams['figure.figsize'] = (6.0, 6.0)
# Load a color image
img = cv2.imread(intro_images_path+'pazo.jpg')
img=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
rows, cols=img.shape[:2]
# Handpicked correspondences
pts1 = np.float32([[10,10], [cols-10,10], [10,rows-10], [cols-10,rows-10]])
pts2 = np.float32([[20,20], [cols-20,20], [int(0.33*cols),rows-10], [int(0.66*cols),rows-10]])
P_M = cv2.getPerspectiveTransform(pts1, pts2)
dst = cv2.warpPerspective(img, P_M, (cols,rows))
image=img.copy()
for pt in pts1:
    cv2.circle(image, (int(pt[0]), int(pt[1])), 10, (255, 0, 0), -1)
for pt in pts2:
    cv2.circle(image, (int(pt[0]), int(pt[1])), 10, (0, 255, 0), -1)
plt.figure(figsize= (0.01*image.shape[0],0.01*image.shape[1]));plt.imshow(image)
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
plt.figure(figsize= (0.01*dst.shape[0],0.01*dst.shape[1]));plt.imshow(dst)
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
Choose 4 (not all aligned) matching points
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Read first view
im1 = cv2.imread(intro_images_path+'library1.jpg', cv2.IMREAD_COLOR)
# Read second view
im2 = cv2.imread(intro_images_path+'library2.jpg', cv2.IMREAD_COLOR)
# Choose 4 (not all aligned) corresponding points between these images
# WRITE DOWN THEIR COORDINATES!!
#####################################################################
# Chosen points :
# im1 : (125,165), (444,67), (13, 149), (280, 245)
# im2 : (128, 167), (392, 39), (57, 158), (237, 241)
# Handpicked correspondences
pts1 = np.float32([[125,165],[444,67],[13,149],[280,245]])
for pt in pts1:
    cv2.circle(im1, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
# Handpicked correspondences
pts2 = np.float32([[128,167],[392,39],[57,158],[237,241]])
for pt in pts2:
    cv2.circle(im2, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
#####################################################################
cv2.imshow('View 1', im1)
cv2.imshow('View 2', im2)
k = cv2.waitKey(0) & 0xFF
if k == 27: # wait for ESC key to exit
    cv2.destroyAllWindows()
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Read first view
im1 = cv2.imread(intro_images_path+'library1.jpg', cv2.IMREAD_COLOR)
# Read second view
im2 = cv2.imread(intro_images_path+'library2.jpg', cv2.IMREAD_COLOR)
im1=cv2.cvtColor(im1, cv2.COLOR_BGR2RGB)
im2=cv2.cvtColor(im2, cv2.COLOR_BGR2RGB)
# Location of reference points
points1 = np.zeros((4, 2), dtype=np.float32)
points2 = np.zeros((4, 2), dtype=np.float32)
# INSERT HERE THE COORDINATES YOU HAVE WRITTEN DOWN,
# CHANGING THE CURRENT VALUES!!
points1[0, :] = [125,165]
points1[1, :] = [444,67]
points1[2, :] = [13,149]
points1[3, :] = [280,245]
points2[0, :] = [128,167]
points2[1, :] = [392,39]
points2[2, :] = [57,158]
points2[3, :] = [237,241]
im1c=im1.copy()
im2c=im2.copy()
for pt in points1:
    cv2.circle(im1c, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
for pt in points2:
    cv2.circle(im2c, (int(pt[0]), int(pt[1])), 5, (0, 255, 0), -1)
plt.imshow(im1c), plt.title("First view")
plt.axis('off')
plt.show()
plt.imshow(im2c), plt.title("Second view")
plt.axis('off')
plt.show()
# Find homography
h, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
# Use homography to align im1 with im2
height1, width1, _ = im1.shape
height2, width2, _ = im2.shape
# Warp im1 with the 3x3 homography to align it with im2
imP = cv2.warpPerspective(im1, h, (1*width1 + 0*width2, 1*height1 + 0*height2))
for pt in points1:
    cv2.circle(imP, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
for pt in points2:
    cv2.circle(imP, (int(pt[0]), int(pt[1])), 5, (0, 255, 0), -1)
plt.imshow(imP), plt.title("Transformed image")
plt.axis('off')
plt.show()
# Print estimated homography
print("Estimated homography : \n", h)
Estimated homography :
 [[ 5.73052578e-01 -7.33662577e-02  5.95436389e+01]
 [-1.24287365e-01  8.66862485e-01  2.78524624e+01]
 [-4.65979916e-04 -6.98173614e-05  1.00000000e+00]]
PNG images usually have four channels. Three color channels for RGB, and the fourth channel is for transparency, also called alpha channel.
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Read first view
im1 = cv2.imread(intro_images_path+'Logo_Paris.png', cv2.IMREAD_COLOR)
# Read second view
im2 = cv2.imread(intro_images_path+'eiffel.jpg', cv2.IMREAD_COLOR)
## Choose 4 (not all aligned) corresponding points
####################################################### Me here
# Handpicked correspondences
pts1 = np.float32([[300,0],[300,1800],[1800,0],[1800,1800]])
for pt in pts1:
    cv2.circle(im1, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
# Handpicked correspondences
pts2 = np.float32([[150,150],[150,350],[375,150],[375,350]])
for pt in pts2:
    cv2.circle(im2, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
####################################################### End Me here
cv2.imshow('View 1', im1)
cv2.imshow('View 2', im2)
k = cv2.waitKey(0) & 0xFF
if k == 27: # wait for ESC key to exit
    cv2.destroyAllWindows()
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Read first view
im1 = cv2.imread(intro_images_path+'Logo_Paris.png', cv2.IMREAD_UNCHANGED)
# Read second view
im2 = cv2.imread(intro_images_path+'eiffel.jpg', cv2.IMREAD_COLOR)
mask = im1[:,:,3] # alpha channel of the logo
im1 = cv2.cvtColor(im1, cv2.COLOR_BGRA2RGB) # drop alpha and switch to RGB
im2=cv2.cvtColor(im2, cv2.COLOR_BGR2RGB)
# Location of reference points
points1 = np.zeros((4, 2), dtype=np.float32)
points2 = np.zeros((4, 2), dtype=np.float32)
points1[0, :] = [300,0]
points1[1, :] = [300,1800]
points1[2, :] = [1800, 0]
points1[3, :] = [1800,1800]
points2[0, :] = [150,150]
points2[1, :] = [150,350]
points2[2, :] = [375,150]
points2[3, :] = [375,350]
im1c=im1.copy()
im2c=im2.copy()
for pt in points1:
    cv2.circle(im1c, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
for pt in points2:
    cv2.circle(im2c, (int(pt[0]), int(pt[1])), 5, (0, 255, 0), -1)
plt.imshow(im1c), plt.title("First view")
plt.axis('off')
plt.show()
plt.imshow(im2c), plt.title("Second view")
plt.axis('off')
plt.show()
# Find homography
h,_ = cv2.findHomography(points1, points2, cv2.RANSAC)
# Use homography to align im1 with im2
height1, width1, _ = im1c.shape
height2, width2, _ = im2c.shape
#Aligned image im1, using a 3x3 transformation matrix
imP = cv2.warpPerspective(im1, h, (width2,height2))
maskP= cv2.warpPerspective(mask, h, (width2,height2))
maskP_inv = cv2.bitwise_not(maskP)
im_bg = cv2.bitwise_and(im2,im2,mask = maskP_inv)
im_fg = cv2.bitwise_and(imP,imP,mask = maskP)
res=cv2.add(im_bg,im_fg)# mask=im_mP)
plt.figure(figsize= (0.02*res.shape[0],0.02*res.shape[1]));
plt.imshow(res), plt.title("Combined image")
plt.axis('off')
plt.show()
# Print estimated homography
print("Estimated homography : \n", h)
Estimated homography :
 [[1.50000000e-01 0.00000000e+00 1.05000000e+02]
 [0.00000000e+00 1.11111111e-01 1.50000000e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
Transform the image torre_hercules_1.jpg to get a view of the scene as if it had been captured from the same point of view as in torre_hercules_2.jpg.
#SOLUTION HERE
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Read first view
im1 = cv2.imread(intro_images_path+'torre_hercules_1.jpg', cv2.IMREAD_COLOR)
# Read second view
im2 = cv2.imread(intro_images_path+'torre_hercules_2.jpg', cv2.IMREAD_COLOR)
## Choose 4 (not all aligned) corresponding points
####################################################### Me here
# Handpicked correspondences
pts1 = np.float32([[670,25],[805,610],[680,510],[725,1255]])
for pt in pts1:
    cv2.circle(im1, (int(pt[0]), int(pt[1])), 5, (0, 0, 255), -1)
# Handpicked correspondences
pts2 = np.float32([[375,50],[415,55],[200,280],[272,830]])
for pt in pts2:
    cv2.circle(im2, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
####################################################### End Me here
cv2.imshow('View 1', im1)
cv2.imshow('View 2', im2)
k = cv2.waitKey(0) & 0xFF
if k == 27: # wait for ESC key to exit
    cv2.destroyAllWindows()
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Read first view
im1 = cv2.imread(intro_images_path+'torre_hercules_1.jpg', cv2.IMREAD_COLOR)
# Read second view
im2 = cv2.imread(intro_images_path+'torre_hercules_2.jpg', cv2.IMREAD_COLOR)
im1=cv2.cvtColor(im1, cv2.COLOR_BGR2RGB)
im2=cv2.cvtColor(im2, cv2.COLOR_BGR2RGB)
# Location of reference points
points1 = np.zeros((4, 2), dtype=np.float32)
points2 = np.zeros((4, 2), dtype=np.float32)
# INSERT HERE THE COORDINATES YOU HAVE WRITTEN DOWN,
# CHANGING THE CURRENT VALUES!!
points1[0, :] = [670,25]
points1[1, :] = [805,610]
points1[2, :] = [680,510]
points1[3, :] = [725,1255]
points2[0, :] = [375,50]
points2[1, :] = [415,55]
points2[2, :] = [200,280]
points2[3, :] = [272,830]
im1c=im1.copy()
im2c=im2.copy()
for pt in points1:
    cv2.circle(im1c, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
for pt in points2:
    cv2.circle(im2c, (int(pt[0]), int(pt[1])), 5, (0, 255, 0), -1)
plt.imshow(im1c), plt.title("First view")
plt.axis('off')
plt.show()
plt.imshow(im2c), plt.title("Second view")
plt.axis('off')
plt.show()
# Find homography
h, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
# Use homography to align im1 with im2
height1, width1, _ = im1.shape
height2, width2, _ = im2.shape
# Warp im1 with the 3x3 homography to align it with im2
imP = cv2.warpPerspective(im1, h, (1*width1 + 0*width2, 1*height1 + 0*height2))
for pt in points1:
    cv2.circle(imP, (int(pt[0]), int(pt[1])), 5, (255, 0, 0), -1)
for pt in points2:
    cv2.circle(imP, (int(pt[0]), int(pt[1])), 5, (0, 255, 0), -1)
plt.imshow(imP), plt.title("Transformed image")
plt.axis('off')
plt.show()
# Print estimated homography
print("Estimated homography : \n", h)
# Explanation of the decisions made:
# I chose these four points because, given the orientation
# of the Tower of Hercules in both images, the highest bar
# on top of the tower is on the right side in the first image
# and on the left side in the second image. So I had to place
# the four corresponding points on the correct side of the tower.
# The points are chosen at different key points.
# Discussion and results:
# The output in the resulting image is as expected, given that
# the approximate angle between the two initial views is around
# 90 degrees, so the resulting transformation is not the easiest
# one. If I had decided to ignore the orientation of the initial
# image and simply assumed that all sides of the Tower of Hercules
# are identical, the resulting image would have been much clearer.
# But, as we can see from the position of the top bar of the
# tower, the sides are not identical.
Estimated homography :
 [[-6.53275695e-01  3.95057381e-02  4.22932337e+02]
 [-7.23310797e-02  1.96669925e-03  4.65760246e+01]
 [-1.55094006e-03  9.58886219e-05  1.00000000e+00]]
import numpy as np
import cv2
import matplotlib.pyplot as plt
bg = cv2.imread(intro_images_path+'background.png',0)
fg = cv2.imread(intro_images_path+'foreground.png',0)
motion = cv2.absdiff(fg, bg) # cv2.subtract saturates at 0 on uint8; absdiff keeps changes in both directions
#plt.figure(figsize= (0.1*img.shape[0],0.1*img.shape[1]))
plt.subplot(131), plt.imshow(fg, 'gray')
plt.xticks([]), plt.yticks([])
plt.subplot(132), plt.imshow(bg,'gray')
plt.xticks([]), plt.yticks([])
plt.subplot(133), plt.imshow(motion, 'gray')
plt.xticks([]), plt.yticks([])
#plt.subplot(224), plt.plot(hist_full)
plt.show()
https://docs.opencv.org/3.4.2/d5/daf/tutorial_py_histogram_equalization.html
OpenCV has a function to do this, cv2.equalizeHist(). Its input is just a grayscale image and output is its histogram equalized image. You can take different images with different light conditions, equalize them and check the results.
Histogram equalization is good when the histogram of the image is confined to a particular region. It doesn't work well on images where both bright and dark pixels are present. In those cases an adaptive equalization is more convenient.
img = cv2.imread(intro_images_path+'fisterra_gray.jpg',0)
equ = cv2.equalizeHist(img)
#res = np.hstack((img,equ)) #stacking images side-by-side
#cv2.imwrite('fisterra_gray_eq.png',res)
plt.figure(figsize= (0.05*img.shape[0],0.05*img.shape[1]))
plt.subplot(421), plt.imshow(img, 'gray')
plt.xticks([]), plt.yticks([])
plt.subplot(422), plt.imshow(equ,'gray')
plt.xticks([]), plt.yticks([])
hist = cv2.calcHist([img],[0],None,[256],[0,256])
hist_norm=100*hist/img.size
plt.subplot(423), plt.plot(hist_norm)
hist = cv2.calcHist([equ],[0],None,[256],[0,256])
hist_norm=100*hist/equ.size
plt.subplot(424), plt.plot(hist_norm)
plt.subplot(425), plt.imshow(img, 'gray')
plt.xticks([]), plt.yticks([])
#Adaptive histogram equalization
clahe = cv2.createCLAHE(clipLimit=4.0, tileGridSize=(16,16))
cl1 = clahe.apply(img)
plt.subplot(426), plt.imshow(cl1,'gray')
plt.xticks([]), plt.yticks([])
hist = cv2.calcHist([img],[0],None,[256],[0,256])
hist_norm=100*hist/img.size
plt.subplot(427), plt.plot(hist_norm)
hist = cv2.calcHist([cl1],[0],None,[256],[0,256])
hist_norm=100*hist/cl1.size
plt.subplot(428), plt.plot(hist_norm)
plt.show()
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(intro_images_path+'pazo.jpg',0)
kernel = np.array([[0, -1, 0], [-1, 5, -1],[0, -1,0]])
dst = cv2.filter2D(img,0,kernel)
plt.figure(figsize= (0.05*img.shape[0],0.05*img.shape[1]))
plt.subplot(121),plt.imshow(img,'gray'), plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(dst,'gray'), plt.title('Filtered')
plt.xticks([]), plt.yticks([])
plt.show()
Image blurring is achieved by convolving the image with a low-pass filter kernel that removes high-frequency content (e.g. noise, edges). Edges are blurred a little in this operation. There are several blurring techniques:
Mean filtering:
The idea of mean filtering is simply to replace each pixel value in an image with the mean value of its neighbors, including itself. This is done by convolving the image with a normalized box filter. OpenCV provides the function cv2.blur(), which takes the width and height of the kernel as parameters. A 3x3 normalized box filter looks like this:
K=[[1/9 1/9 1/9],
[1/9 1/9 1/9],
[1/9 1/9 1/9]]
Gaussian average
In the Gaussian blur operation, the image is convolved with a Gaussian kernel instead of the box filter. OpenCV provides the cv2.GaussianBlur(src, ksize, sigmaX) function, which accepts the following parameters:
src − input image
dst − output image
ksize − size of the kernel
sigmaX − Gaussian kernel standard deviation in the X direction (also used for Y unless sigmaY is given; if 0, it is computed from the kernel size)
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(intro_images_path+'zebra.jpg')
blur1 = cv2.GaussianBlur(img,(15,15),0)
blur2 = cv2.blur(img,(15,15))
plt.figure(figsize= (0.05*img.shape[0],0.05*img.shape[1]))
plt.subplot(131),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(blur1),plt.title('Gaussian blur')
plt.xticks([]), plt.yticks([])
plt.subplot(133),plt.imshow(blur2),plt.title('Box blur')
plt.xticks([]), plt.yticks([])
plt.show()
cv2.imwrite(output_images_path+'zebragb.jpg',blur1)
cv2.imwrite(output_images_path+'zebrab.jpg',blur2)
img = cv2.imread(intro_images_path+'gate.jpg')
img2 = np.zeros(img.shape[:2], np.uint8)
img2[100:170, 100:170] = 255
img2[180:250, 100:170] = 255
img2[100:170, 200:270] = 255
img2[180:250, 200:270] = 255
blur1 = cv2.GaussianBlur(img2,(15,15),0)
blur2 = cv2.blur(img2,(15,15))
plt.figure(figsize= (0.05*img2.shape[0],0.05*img2.shape[1]))
plt.subplot(131),plt.imshow(img2,'gray'),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(blur1,'gray'),plt.title('Gaussian blur')
plt.xticks([]), plt.yticks([])
plt.subplot(133),plt.imshow(blur2,'gray'),plt.title('Box blur')
plt.xticks([]), plt.yticks([])
plt.show()
https://docs.opencv.org/3.4.2/d4/dc6/tutorial_py_template_matching.html
Template Matching is a method for searching and finding the location of a template image in a larger image. OpenCV comes with the function cv2.matchTemplate() for this purpose. It simply slides the template image (T) over the input image (I), as in 2D correlation, and compares the template against each patch of the input image to produce a result image (R). Several comparison methods are implemented in OpenCV. They return a grayscale image (R), where each pixel denotes how well the neighbourhood of that pixel matches the template.
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(intro_images_path+'einstein.jpg',0)
img2 = img.copy()
template = cv2.imread(intro_images_path+'einstein_eye.jpg',0)
w, h = template.shape[::-1]
# All the 6 methods for comparison in a list
methods = ['cv2.TM_CCOEFF', 'cv2.TM_CCOEFF_NORMED', 'cv2.TM_CCORR',
'cv2.TM_CCORR_NORMED', 'cv2.TM_SQDIFF', 'cv2.TM_SQDIFF_NORMED']
for meth in methods:
    img = img2.copy()
    method = eval(meth)
    # Apply template Matching
    res = cv2.matchTemplate(img, template, method)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    # If the method is TM_SQDIFF or TM_SQDIFF_NORMED, take minimum
    if method in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
        top_left = min_loc
    else:
        top_left = max_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)
    img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
    cv2.rectangle(img, top_left, bottom_right, [0,255,0], 2)
    plt.subplot(121), plt.imshow(res, cmap='gray')
    plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
    plt.subplot(122), plt.imshow(img, cmap='gray')
    plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
    plt.suptitle(meth)
    plt.show()
    if method in [cv2.TM_CCOEFF_NORMED]:
        res = (res - min_val)/(max_val - min_val)*255
        cv2.imwrite(output_images_path+'einstein_ccoeff_normed.jpg', res)
An image pyramid is a collection of images, usually obtained by repeatedly downsampling a single original image. It is a set of layers in which the higher the layer, the smaller the size.
Image pyramids are used to achieve scale-invariant processing, for example when matching a template whose occurrences may appear at different sizes (as in a later exercise).
The basic steps in building a Gaussian image pyramid are: smooth the image with a Gaussian filter, downsample it (typically by a factor of two), and repeat until the desired number of levels or a minimum size is reached.
import numpy as np
import cv2
import matplotlib.pyplot as plt
from skimage import data
def pyramid_bf(image,l=5, minSizex=8, minSizey=8):
    # yield the original image
    level=0
    yield image
    # keep looping over the pyramid
    while True:
        # if the maximum level is reached or the resized image does not
        # meet the supplied minimum size, stop constructing the pyramid
        if level==l or image.shape[0]<minSizex or image.shape[1]<minSizey:
            break
        h,w=image.shape[:2]
        image=image[:h+1:2,:w+1:2]
        # yield the next image in the pyramid
        yield image
        level=level+1
def pyramid_sm(image,l=5, minSizex=8, minSizey=8):
    level=0
    yield image
    while True:
        if level==l or image.shape[0]<minSizex or image.shape[1]<minSizey:
            break
        image = cv2.GaussianBlur(image,(5,5),0)
        h,w=image.shape[:2]
        image=image[:h+1:2,:w+1:2]
        yield image
        level=level+1
image = cv2.imread(intro_images_path + 'einstein.jpg',0)#'aliasing.jpg',0)#
for layer in pyramid_bf(image,l=3):
    print(layer.shape)
    plt.figure(figsize= (0.02*layer.shape[0],0.02*layer.shape[1]))
    plt.imshow(layer,cmap='gray')
    plt.xticks([]), plt.yticks([])
    plt.show()
# show the last (smallest) layer again at the original figure size
plt.figure(figsize= (0.02*image.shape[0],0.02*image.shape[1]))
plt.imshow(layer,cmap='gray')
plt.xticks([]), plt.yticks([])
plt.show()
for layer in pyramid_sm(image, l=3):
    print(layer.shape)
    plt.figure(figsize= (0.02*layer.shape[0],0.02*layer.shape[1]))
    plt.imshow(layer,cmap='gray')
    plt.xticks([]), plt.yticks([])
    plt.show()
# show the last (smallest) layer again at the original figure size
plt.figure(figsize= (0.02*image.shape[0],0.02*image.shape[1]))
plt.imshow(layer,cmap='gray')
plt.xticks([]), plt.yticks([])
plt.show()
(491, 383)
(246, 192)
(123, 96)
(62, 48)
(491, 383)
(246, 192)
(123, 96)
(62, 48)
OpenCV has a specific function to downscale images: cv2.pyrDown()
def pyramid(layer, l=4, minSizex=8, minSizey=8):
    level=0
    yield layer
    print(layer.shape)
    while True:
        layer = cv2.pyrDown(layer)
        if level== l or layer.shape[0]<minSizex or layer.shape[1]<minSizey:
            break
        yield layer
        level=level+1
for layer in pyramid(image, l=3):
    print(layer.shape)
    plt.figure(figsize= (0.02*layer.shape[0],0.02*layer.shape[1]))
    plt.imshow(layer,cmap='gray')
    plt.xticks([]), plt.yticks([])
    plt.show()
# show the last (smallest) layer again at the original figure size
plt.figure(figsize= (0.02*image.shape[0],0.02*image.shape[1]))
plt.imshow(layer,cmap='gray')
plt.xticks([]), plt.yticks([])
plt.show()
(491, 383)
(491, 383)
(246, 192)
(123, 96)
(62, 48)
https://docs.opencv.org/3.4.2/d4/dc6/tutorial_py_template_matching.html
Template Matching is a method for searching and finding the location of a template image in a larger image. In order to cope with different sizes of the occurrences of the template, we can take advantage of image pyramids; a minimal sketch of this multiscale approach follows.
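A minimal sketch of that idea, assuming the Tuna images used in the exercise below and a hypothetical 0.6 score threshold: match the fixed-size template against successively smaller cv2.pyrDown() levels and map detections back to the original resolution.
import cv2
import numpy as np
img_gray = cv2.imread(intro_images_path+'Tuna_Relative_Sizes.jpg', 0)
template = cv2.imread(intro_images_path+'Tuna_template.jpg', 0)
w, h = template.shape[::-1]
layer, scale = img_gray.copy(), 1.0
while layer.shape[0] >= h and layer.shape[1] >= w:
    res = cv2.matchTemplate(layer, template, cv2.TM_CCOEFF_NORMED)
    for pt in zip(*np.where(res >= 0.6)[::-1]):
        # rescale the match position (and box size) to original coordinates
        print('match at ({}, {}), box {}x{}'.format(
            int(pt[0]*scale), int(pt[1]*scale), int(w*scale), int(h*scale)))
    layer = cv2.pyrDown(layer)  # next, coarser pyramid level
    scale *= 2.0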
Based on the exercises about template matching and the construction of Gaussian pyramids, try to detect as many tunas as possible in the image Tuna_Relative_Sizes.jpg, given the template in Tuna_template.jpg.
#SOLUTION HERE
import cv2
import numpy as np
from matplotlib import pyplot as plt
img_rgb = cv2.imread(intro_images_path+'Tuna_Relative_Sizes.jpg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread(intro_images_path+'Tuna_template.jpg',0)
w, h = template.shape[::-1]
# All the 6 methods for comparison in a list
# methods = ['cv2.TM_CCOEFF', 'cv2.TM_CCOEFF_NORMED', 'cv2.TM_CCORR',
# 'cv2.TM_CCORR_NORMED', 'cv2.TM_SQDIFF', 'cv2.TM_SQDIFF_NORMED']
methods = ['cv2.TM_CCOEFF_NORMED']
for meth in methods:
    method = eval(meth)
    res = cv2.matchTemplate(img_gray,template,method)
    threshold = 0.51
    loc = np.where( res >= threshold)
    for pt in zip(*loc[::-1]):
        cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0,0,255), 2)
    plt.subplot(121),plt.imshow(res,cmap = 'gray')
    plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
    plt.subplot(122),plt.imshow(img_rgb,cmap = 'gray')
    plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
    plt.suptitle(meth)
    plt.show()
# Explanation of the decisions made:
# My choice of method and threshold value came after several tests,
# especially when I noticed that a higher threshold leads to fewer
# tunas detected, while a lower threshold leads to more false-positive
# detections.
# Discussion and results:
# After trying all the possible values for the final parameter of the
# cv2.matchTemplate function, 'cv2.TM_CCOEFF_NORMED' gave the best
# results with a threshold of 0.51: it detected 3 tunas perfectly and
# partially detected the other 4.
OpenCV provides three types of derivative (high-pass) filters: Sobel, Scharr and Laplacian. Repeat the first exercises on this page: https://docs.opencv.org/4.3.0/d5/d0f/tutorial_py_gradients.html https://docs.opencv.org/3.4.2/d5/d0f/tutorial_py_gradients.html
The Sobel operators combine Gaussian smoothing and differentiation, so the result is more or less resistant to noise.
dst = cv2.Sobel(src, ddepth, dx, dy[, ksize[, scale[, delta[, borderType]]]])
Direction of derivatives: pass 0 or 1 as the dx and dy arguments, respectively.
Size of kernel: argument ksize. If ksize = -1, a 3x3 Scharr filter is used, which gives better results than a 3x3 Sobel filter.
Gradient filters can also be applied using the cv2.filter2D() function.
The following are the 3x3 kernels corresponding to the first x image derivative: bare derivative without Gaussian smoothing, Sobel and Scharr (the y versions are their transposes):
K_derivative = [[0, 0, 0],
                [-1, 0, 1],
                [0, 0, 0]]
K_sobel = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
K_scharr = [[-3, 0, 3],
            [-10, 0, 10],
            [-3, 0, 3]]
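For instance (a minimal sketch; the tower image is borrowed from the example below), applying the Sobel x-kernel with cv2.filter2D() should agree with cv2.Sobel():
import cv2
import numpy as np
img = cv2.imread(edge_images_path+'hercules_tower.jpg', 0)
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], np.float32)
gx = cv2.filter2D(img, cv2.CV_64F, sobel_x)         # filter2D computes correlation
gx_ref = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)  # built-in Sobel derivative
print(np.abs(gx - gx_ref).max())                    # expected: 0.0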
The Laplacian is the sum of the second partial derivatives. OpenCV has the cv2.Laplacian() function, which computes these second derivatives internally with Sobel filters.
dst = cv2.Laplacian(src, ddepth[, ksize[, scale[, delta[, borderType]]]])
If ksize == 1 (default), then the following kernel is used for filtering:
K = [[0, 1, 0],
     [1, -4, 1],
     [0, 1, 0]]
The code below shows all the operators in a single figure. All kernels are 3x3. The depth of the output image is passed to get the result as np.uint8.
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(edge_images_path+'hercules_tower.jpg',0)
laplacian = cv2.Laplacian(img, cv2.CV_8U,ksize=3)
sobelx = cv2.Sobel(img, cv2.CV_8U,1,0,ksize=3)
sobely = cv2.Sobel(img, cv2.CV_8U,0,1,ksize=3)
scharrx = cv2.Sobel(img, cv2.CV_8U,1,0,ksize=-1)
scharry = cv2.Sobel(img, cv2.CV_8U,0,1,ksize=-1)
plt.figure(figsize=(0.05*img.shape[0],0.05*img.shape[1]))
plt.subplot(3,2,1),plt.imshow(img,cmap = 'gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(3,2,2),plt.imshow(laplacian,cmap = 'gray')
plt.title('Laplacian'), plt.xticks([]), plt.yticks([])
plt.subplot(3,2,3),plt.imshow(sobelx,cmap = 'gray')
plt.title('Sobel X'), plt.xticks([]), plt.yticks([])
plt.subplot(3,2,4),plt.imshow(sobely,cmap = 'gray')
plt.title('Sobel Y'), plt.xticks([]), plt.yticks([])
plt.subplot(3,2,5),plt.imshow(scharrx,cmap = 'gray')
plt.title('Scharr X'), plt.xticks([]), plt.yticks([])
plt.subplot(3,2,6),plt.imshow(scharry,cmap = 'gray')
plt.title('Scharr Y'), plt.xticks([]), plt.yticks([])
plt.show()
IMPORTANT!
Output datatype can be cv2.CV_8U or np.uint8, but there is a slight problem with that. A black-to-white transition is taken as a positive slope (it has a positive value), while a white-to-black transition is taken as a negative slope (it has a negative value). So when you convert data to np.uint8, all negative slopes are made zero; in simple words, you miss that edge. If you want to detect both edges, a better option is to keep the output datatype in some higher form, like cv2.CV_64F, take its absolute value and then convert back to cv2.CV_8U. The code below demonstrates this procedure for horizontal and vertical Sobel filters and the difference in results.
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(edge_images_path+'hercules_tower.jpg',0)
laplacian = cv2.Laplacian(img, cv2.CV_8U,ksize=3)
# Output dtype = cv2.CV_8U
sobelx8u = cv2.Sobel(img,cv2.CV_8U,1,0,ksize=3)
sobely8u = cv2.Sobel(img,cv2.CV_8U,0,1,ksize=3)
# Output dtype = cv2.CV_64F. Then take its absolute value and convert to cv2.CV_8U to visualize it
sobelx64f = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=3)
sobely64f = cv2.Sobel(img,cv2.CV_64F,0,1,ksize=3)
abs_sobelx64f = np.absolute(sobelx64f)
sobelx64f_to_8u = np.uint8(abs_sobelx64f)
abs_sobely64f = np.absolute(sobely64f)
sobely64f_to_8u = np.uint8(abs_sobely64f)
plt.figure(figsize=(0.05*img.shape[0],0.05*img.shape[1]))
plt.subplot(2,2,1),plt.imshow(sobelx8u,cmap = 'gray')
plt.title('Sobel X CV_8U'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,2),plt.imshow(sobelx64f_to_8u,cmap = 'gray')
plt.title('Sobel X abs(CV_64F)'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,3),plt.imshow(sobely8u,cmap = 'gray')
plt.title('Sobel Y CV_8U'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,4),plt.imshow(sobely64f_to_8u,cmap = 'gray')
plt.title('Sobel Y abs(CV_64F)'), plt.xticks([]), plt.yticks([])
plt.show()
https://docs.opencv.org/4.3.0/da/d22/tutorial_py_canny.html https://docs.opencv.org/3.4.2/da/d22/tutorial_py_canny.html
The Canny algorithm was designed to meet the criteria of good detection, good localization and single response. It proceeds in multiple stages. OpenCV implements it as the function cv2.Canny().
edges = cv2.Canny(image, threshold1, threshold2[, edges[, apertureSize[, L2gradient]]])
You can find a detailed implementation of the Canny detector here: https://towardsdatascience.com/canny-edge-detection-step-by-step-in-python-computer-vision-b49c3a2d8123
import cv2
import numpy as np
from matplotlib import pyplot as plt
image = cv2.imread(edge_images_path+'hercules_tower.jpg',0)
img=cv2.GaussianBlur(image,(3,3),0) # some previous smoothing is usually convenient
edges = cv2.Canny(img,5,15,apertureSize=3,L2gradient=True)
plt.figure(figsize=(0.05*img.shape[0],0.05*img.shape[1]))
plt.subplot(131),plt.imshow(image,cmap = 'gray')
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(edges,cmap = 'gray')
plt.title('Edge Image s1'), plt.xticks([]), plt.yticks([])
img=cv2.GaussianBlur(image,(15,15),0) # some previous smoothing is usually convenient
edges = cv2.Canny(img,5,15,apertureSize=3,L2gradient=True)
plt.subplot(133),plt.imshow(edges,cmap = 'gray')
plt.title('Edge Image s2'), plt.xticks([]), plt.yticks([])
plt.show()
Kernel size and sigma are dependent values. Commonly, a Gaussian distribution has 99.7% of its probability mass within $3\sigma$, so the size of the kernel should be about $k_{size} = 6\sqrt{2}\,\sigma$; anyway, $k_{size} = 6\sigma$, or even shorter, is usually enough.
The specific implementation in OpenCV estimates the value of sigma (if 0 is passed) from the kernel size, but not the other way around. You can find the relationship between these two parameters (according to OpenCV) here: https://docs.opencv.org/2.4/modules/imgproc/doc/filtering.html#getgaussiankernel
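A minimal sketch of that relationship, using the formula from the getGaussianKernel documentation linked above:
def sigma_from_ksize(ksize):
    # sigma that cv2.getGaussianKernel assumes when sigma <= 0 is passed
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
for k in (3, 7, 15, 21):
    print('ksize={} -> sigma={:.2f}'.format(k, sigma_from_ksize(k)))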
#SOLUTION HERE
import cv2
import numpy as np
from matplotlib import pyplot as plt
image = cv2.imread(edge_images_path+'chuvia.jpg',0)
img=cv2.GaussianBlur(image,(15,15),0)
edges = cv2.Canny(img,10,20,apertureSize=3,L2gradient=True)
plt.figure(figsize=(0.05*img.shape[0],0.05*img.shape[1]))
plt.subplot(131),plt.imshow(image,cmap = 'gray')
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(edges,cmap = 'gray')
plt.title('Edge Image'), plt.xticks([]), plt.yticks([])
plt.show()
image = cv2.imread(edge_images_path+'xardin.jpg',0)
img=cv2.GaussianBlur(image,(7,7),0)
edges = cv2.Canny(img,1,100,apertureSize=3,L2gradient=True)
plt.figure(figsize=(0.05*img.shape[0],0.05*img.shape[1]))
plt.subplot(131),plt.imshow(image,cmap = 'gray')
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(edges,cmap = 'gray')
plt.title('Edge Image'), plt.xticks([]), plt.yticks([])
plt.show()
image = cv2.imread(edge_images_path+'paseo.jpg',0)
img=cv2.GaussianBlur(image,(21,21),0)
edges = cv2.Canny(img,15,20,apertureSize=3,L2gradient=True)
plt.figure(figsize=(0.05*img.shape[0],0.05*img.shape[1]))
plt.subplot(131),plt.imshow(image,cmap = 'gray')
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(edges,cmap = 'gray')
plt.title('Edge Image'), plt.xticks([]), plt.yticks([])
plt.show()
# Explanation of the decisions made:
# The first parameter I changed was the Gaussian blurring. Depending
# on each image, I needed a certain kernel size so that the maximum
# amount of noise could be removed and the image could be smoothed.
# Both thresholds then helped me remove the details I didn't really
# want to keep, so the edge map could be cleaner.
# Discussion and results:
# The results are quite good. We can clearly see that, for each image,
# we were able to retrieve the geometrical forms in it while keeping
# the edge maps fairly clean. Just as a reminder, we consider that the
# big water droplets are not noise but rather an essential part of the
# first image! (Otherwise we could easily tweak the algorithm further
# to remove them as well.)
Compare the performance of the following two template matching alternatives: (i) using image intensities as matching features, and (ii) using edge maps as matching features. Use faces.jpg and two different templates, right_eye.jpg and right_eye_i.jpg. You will have to compute the edge maps before computing the correlation in case (ii).
#SOLUTION HERE
import cv2
import numpy as np
from matplotlib import pyplot as plt
img_rgb = cv2.imread(edge_images_path+'faces.jpg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread(edge_images_path+'right_eye.jpg',0)
w, h = template.shape[::-1]
methods = ['cv2.TM_CCOEFF_NORMED']
for meth in methods:
    method = eval(meth)
    res = cv2.matchTemplate(img_gray,template,method)
    threshold = 0.5
    loc = np.where( res >= threshold)
    for pt in zip(*loc[::-1]):
        cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0,0,255), 2)
    plt.subplot(121),plt.imshow(res,cmap = 'gray')
    plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
    plt.subplot(122),plt.imshow(img_rgb,cmap = 'gray')
    plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
    plt.suptitle(meth)
    print(len(loc[0]))
    plt.show()
###
846
https://docs.opencv.org/3.4.2/d6/d10/tutorial_py_houghlines.html
A line in image space can be expressed with two parameters: in the Cartesian system, the slope m and intercept b; in the Polar system, the distance r and angle θ.
For Hough Transforms, we will express lines in the Polar system. Hence, a line equation can be written as r = x·cosθ + y·sinθ. In general, for each point (x0, y0), we can define the family of lines that go through that point as r = x0·cosθ + y0·sinθ; each pair (r, θ) represents a line that passes through (x0, y0).
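As a small numeric sketch of this parametrization (the point coordinates are arbitrary), each image point votes for a sinusoid of (r, θ) pairs in the accumulator:
import numpy as np
x0, y0 = 50, 20                                 # an arbitrary image point
thetas = np.deg2rad(np.arange(0, 180, 30))      # a few sampled angles
rs = x0 * np.cos(thetas) + y0 * np.sin(thetas)  # r = x0*cos(theta) + y0*sin(theta)
for t, r in zip(np.rad2deg(thetas), rs):
    print('theta={:.0f} deg -> r={:.1f}'.format(t, r))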
Hough Transform is encapsulated in the OpenCV function cv2.HoughLines().
lines = cv2.HoughLines(edges, rho, theta, threshold[, srn[, stn[, min_theta[, max_theta]]]])
import cv2
import numpy as np
import math
from matplotlib import pyplot as plt
img = cv2.imread(edge_images_path+'building.jpg')
imgP=img.copy()
# note: Canny below is applied to the blurred BGR image directly
gray = cv2.GaussianBlur(img,(7,7),0)
edges = cv2.Canny(gray,1,50,apertureSize=3,L2gradient=True)
lines = cv2.HoughLines(edges,1,np.pi/180,120)
#Draw all the lines
if lines is not None:
    for i in range(0,len(lines)):
        rho = lines[i][0][0]
        theta = lines[i][0][1]
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a * rho
        y0 = b * rho
        pt1 = (int(x0 + 1000*(-b)), int(y0 + 1000*(a)))
        pt2 = (int(x0 - 1000*(-b)), int(y0 - 1000*(a)))
        cv2.line(imgP, pt1, pt2, (0,0,255), 2)
plt.imshow(edges,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
plt.imshow(imgP,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
A more efficient implementation of the Hough Line Transform is the function cv2.HoughLinesP(). It returns the endpoints of the detected line segments (x0, y0, x1, y1). This function has two extra parameters:
minLineLength - minimum line length; line segments shorter than this are rejected.
maxLineGap - maximum allowed gap between points on the same line to link them.
linesP = cv2.HoughLinesP(edges,1,np.pi/180,50,minLineLength=50,maxLineGap=10)
imgP=img.copy()
if linesP is not None:
    for i in range(0, len(linesP)):
        l = linesP[i][0]
        cv2.line(imgP, (l[0], l[1]), (l[2], l[3]), (255,0,0), 3, cv2.LINE_AA)
plt.imshow(imgP)
plt.show()
Consider the image naval.jpg. Try to detect lines with the best trade-off between the number of true positives and false negatives. Consider different levels of smoothing before applying the Canny detector.
#SOLUTION HERE
### Version 1: Using the improved Hough Transform cv2.HoughLinesP()
import cv2
import numpy as np
import math
from matplotlib import pyplot as plt
img = cv2.imread(edge_images_path+'naval.jpg')
imgP=img.copy()
# note: Canny below is applied to the blurred BGR image directly
gray = cv2.GaussianBlur(img,(15,15),0)
edges = cv2.Canny(gray,45,130,apertureSize=3,L2gradient=True)
edgesP = edges.copy()
linesP = cv2.HoughLinesP(edges,1,np.pi/180,150,minLineLength=350,maxLineGap=250)
if linesP is not None:
    for i in range(0, len(linesP)):
        l = linesP[i][0]
        cv2.line(imgP, (l[0], l[1]), (l[2], l[3]), (255,0,0), 3, cv2.LINE_AA)
        cv2.line(edgesP, (l[0], l[1]), (l[2], l[3]), (255,0,0), 3, cv2.LINE_AA)
print("Number of lines : {} lines".format(len(linesP)))
plt.imshow(img,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
plt.imshow(edges,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
plt.imshow(edgesP,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
plt.imshow(imgP,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
# Explanation of the decisions made:
# Given that I wanted to compare the basic and the improved Hough
# Transform functions, I decided to start with the improved version to
# get whole lines drawn. I had to do some parameter tuning to find the
# best combination possible.
# Discussion and results:
# Multiple tests and permutations later, I settled on the values that
# appear in the script above, as they gave the best ratio between true
# positives and false negatives. I was able to retrieve a total of 46
# lines, with crisp lines and good coverage of the main parts of the
# image.
Number of lines : 46 lines
#SOLUTION HERE
### Version 2: Using the basic Hough Transform cv2.HoughLines()
import cv2
import numpy as np
import math
from matplotlib import pyplot as plt
img = cv2.imread(edge_images_path+'naval.jpg')
imgP=img.copy()
# note: Canny below is applied to the blurred BGR image directly
gray = cv2.GaussianBlur(img,(17,17),0)
edges = cv2.Canny(gray,1,50,apertureSize=3,L2gradient=True)
edgesP = edges.copy()
lines = cv2.HoughLines(edges,1,np.pi/180,280)
if lines is not None:
    for i in range(0,len(lines)):
        rho = lines[i][0][0]
        theta = lines[i][0][1]
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a * rho
        y0 = b * rho
        pt1 = (int(x0 + 1000*(-b)), int(y0 + 1000*(a)))
        pt2 = (int(x0 - 1000*(-b)), int(y0 - 1000*(a)))
        cv2.line(imgP, pt1, pt2, (255,0,0), 2)
        cv2.line(edgesP, pt1, pt2, (255,0,0), 2)
print("Number of lines : {} lines".format(len(lines)))
plt.imshow(img,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
plt.imshow(edges,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
plt.imshow(edgesP,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
plt.imshow(imgP,cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()
# Explanation of the decisions made:
# For this second version, I wanted to check the results when only
# using the basic Hough Transform cv2.HoughLines(). I spent some time
# tuning the parameters, as dealing with multiple parameters at the
# same time is not easy.
# Discussion and results:
# For the parameters, I settled on the values in the script above.
# I was able to retrieve a total of 31 lines, only missing some lines
# in the bottom right part of the image.
Number of lines : 31 lines
https://docs.opencv.org/4.x/da/d53/tutorial_py_houghcircles.html
A circle is represented mathematically by its center (x_center, y_center) and its radius r. Since it has 3 parameters, we need a 3D accumulator for the Hough transform. OpenCV offers the function cv2.HoughCircles().
circles= cv2.HoughCircles(image, method, dp, minDist[, param1[, param2[, minRadius[, maxRadius]]]])
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread(edge_images_path+'faces.jpg',0)
img = cv2.GaussianBlur(img,(9,9),2)
cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)
circles = cv2.HoughCircles(img,cv2.HOUGH_GRADIENT,dp=2,minDist=30, param1=40,param2=40,minRadius=30,maxRadius=40)
circles = np.uint16(np.around(circles))
for i in circles[0,:]:
    # draw the outer circle
    cv2.circle(cimg,(i[0],i[1]),i[2],(0,255,0),2)
    # draw the center of the circle
    cv2.circle(cimg,(i[0],i[1]),2,(0,0,255),3)
plt.imshow(cimg)
plt.title('Detected Circles'), plt.xticks([]), plt.yticks([])
plt.show()
Implement a program to estimate the number of fishes in the fishes.jpg image, according to the number of detected eyes. Choose the most convenient parameters.
#SOLUTION HERE
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread(feature_images_path+'fishes.jpg',0)
img = cv2.GaussianBlur(img,(5,5),2)
cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)
circles = cv2.HoughCircles(img,cv2.HOUGH_GRADIENT,dp=2,minDist=40, param1=50,param2=36,minRadius=30,maxRadius=35)
circles = np.uint16(np.around(circles))
for i in circles[0,:]:
    # draw the outer circle
    cv2.circle(cimg,(i[0],i[1]),i[2],(0,255,0),2)
    # draw the center of the circle
    cv2.circle(cimg,(i[0],i[1]),2,(0,0,255),3)
print('Approximate number of fishes is : {}'.format(len(circles[0,:])))
plt.imshow(cimg)
plt.title('Detected Circles'), plt.xticks([]), plt.yticks([])
plt.show()
# Explanation of the decisions made:
# Tuning this algorithm was easier than in the previous examples, as I
# mainly focused on two parameters: the higher of the two thresholds
# passed to the Canny edge detector, and the accumulator threshold for
# the circle centers at the detection stage. The basic 40-40
# configuration had an acceptable result with a total of 8 fishes
# detected, but I felt I could do better by further tuning the function.
# Discussion and results:
# After multiple tests, my script detects 7 fish eyes perfectly, while
# approximating the other fishes that are harder to detect, for a total
# of 11 fishes. A solid performance overall.
Approximate number of fishes is : 11
https://docs.opencv.org/4.3.0/d7/d4d/tutorial_py_thresholding.html
In simple thresholding, if a pixel value is greater than a threshold value, it is assigned a given value (e.g., white); otherwise, it is assigned a different value (e.g., black). The function to apply a threshold in OpenCV is cv2.threshold().
retval, dst=cv2.threshold(src, thresh, maxval, type)
type - OpenCV provides different types of thresholding: cv2.THRESH_BINARY, cv2.THRESH_BINARY_INV, cv2.THRESH_TRUNC, cv2.THRESH_TOZERO, cv2.THRESH_TOZERO_INV
retval, dst - the used threshold and the thresholded image, respectively
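A minimal sketch comparing the five types side by side (the rock image and the 150 threshold are borrowed from the example below):
import cv2
from matplotlib import pyplot as plt
img = cv2.imread(seg_images_path+'black-trap-rock.jpg', 0)
ttypes = [cv2.THRESH_BINARY, cv2.THRESH_BINARY_INV, cv2.THRESH_TRUNC,
          cv2.THRESH_TOZERO, cv2.THRESH_TOZERO_INV]
names = ['BINARY', 'BINARY_INV', 'TRUNC', 'TOZERO', 'TOZERO_INV']
for i, (t, name) in enumerate(zip(ttypes, names)):
    ret, th = cv2.threshold(img, 150, 255, t)   # same source image, same threshold
    plt.subplot(2, 3, i+1), plt.imshow(th, 'gray')
    plt.title(name), plt.xticks([]), plt.yticks([])
plt.show()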
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(seg_images_path+'black-trap-rock.jpg',0)
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.imshow(img,cmap = 'gray')
plt.show()
plt.title('Histogram')
plt.hist(img.ravel(),256,[0,256]); #plt.show()
ret,thresh = cv2.threshold(img,150,255,cv2.THRESH_BINARY)
plt.show()
plt.title('Thresholded'), plt.xticks([]), plt.yticks([])
plt.imshow(thresh,cmap = 'gray')
plt.show()
# Apply the Component analysis function
analysis = cv2.connectedComponentsWithStats(thresh,4, cv2.CV_32S)
(totalLabels, label_ids, values, centroid) = analysis
# Initialize a new image to store
# all the output components
output = np.zeros(img.shape, dtype="uint8")
# Loop through each component
for i in range(1, totalLabels):
    # Area of the component
    area = values[i, cv2.CC_STAT_AREA]
    if (area > 250):
        componentMask = (label_ids == i).astype("uint8") * 255
        output = cv2.bitwise_or(output, componentMask)
plt.imshow(output, cmap='gray')
plt.title('Filtered components')
plt.xticks([]),plt.yticks([])
plt.show()
#plt.figure(figsize=(0.05*img.shape[0],0.05*img.shape[1]))
#plt.subplot(132)
#plt.title('Dilated'), plt.xticks([]), plt.yticks([])
#kernel = np.ones((5,5),np.uint8)
#dilate = cv2.dilate(thresh,kernel,iterations = 1)
#plt.imshow(dilate,cmap = 'gray')
#plt.subplot(133)
#plt.title('Closed'), plt.xticks([]), plt.yticks([])
#kernel = np.ones((5,5),np.uint8)
#closing = cv2.erode(dilate,kernel,iterations = 1)
#plt.imshow(closing,cmap = 'gray')
#plt.show()
A global threshold value may not be good if the input image corresponds to a scene with varying lighting conditions. In that case, adaptive thresholding is preferable.
dst = cv2.adaptiveThreshold(src, maxval, adaptiveMethod, thresholdType, blockSize, C)
adaptiveMethod - decides how the threshold value is calculated:
cv2.ADAPTIVE_THRESH_MEAN_C: the threshold value is the mean of the neighbourhood area minus the constant C.
cv2.ADAPTIVE_THRESH_GAUSSIAN_C: the threshold is the Gaussian-weighted mean of the neighbourhood values minus C.
thresholdType - must be either THRESH_BINARY or THRESH_BINARY_INV
The piece of code below compares global thresholding and adaptive thresholding for an image with varying illumination:
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(seg_images_path+'rochas.jpg',0)
img = cv2.medianBlur(img,5)
C=10
blockSize=25
ret,th1 = cv2.threshold(img,100,255,cv2.THRESH_BINARY)
th2 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,blockSize,C)
th3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,blockSize,C)
print("threshold: {}".format(ret))
titles = ['Original Image', 'Global Thresholding',
'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]
plt.figure(figsize=(0.025*img.shape[0],0.025*img.shape[1]))
for i in range(4):
    plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()
threshold: 100.0
#CODE HERE
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(seg_images_path+'penumbra.jpg',0)
img = cv2.medianBlur(img,5)
C=7
blockSize=9
ret,th1 = cv2.threshold(img,100,255,cv2.THRESH_BINARY)
th2 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,blockSize,C)
th3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,blockSize,C)
print("threshold: {}".format(ret))
titles = ['Original Image', 'Global Thresholding',
'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]
plt.figure(figsize=(0.025*img.shape[0],0.025*img.shape[1]))
for i in range(4):
    plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()
# Explanation of the decisions made:
# No major tests were done for this use case. Both functions used are
# quite simple and straightforward; all the heavy work is done
# implicitly.
# Discussion and results:
# As we can see, some parts of the sky are still detected when using
# Adaptive Mean Thresholding, fewer when using Adaptive Gaussian
# Thresholding. Overall, this second technique gives the best result
# for this specific image (in general it depends on the context).
threshold: 100.0
In global thresholding, we used an arbitrary threshold value; but how can we know the value we selected is good or not? The answer is trial and error. Now consider a bimodal image (an image whose histogram has two peaks). For such an image, Otsu's binarization automatically calculates a threshold in the valley between the two peaks. (For images which are not bimodal, the binarization will not be accurate.)
The cv2.threshold() function can be used with an extra flag, cv2.THRESH_OTSU, and a threshold value equal to zero. The algorithm then finds the optimal threshold value, which is returned as retval. If Otsu thresholding is not used, retval is the same as the threshold value you passed.
Check out the example below. The input image is a noisy image. In the first case, global thresholding with a value of 127 is applied. In the second case, Otsu's thresholding is applied directly. In the third case, the image is first filtered with a 3x3 Gaussian kernel to remove the noise, and then Otsu thresholding is applied.
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(seg_images_path+ 'noisy.png',0)
# global thresholding
ret1,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
# Otsu's thresholding
ret2,th2 = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(img,(3,3),0)
ret3,th3 = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
print('ret1: {}'.format(ret1))
print('ret2: {}'.format(ret2))
print('ret3: {}'.format(ret3))
# plot all the images and their histograms
images = [img, 0, th1,
img, 0, th2,
blur, 0, th3]
titles = ['Original Noisy Image','Histogram','Global Thresholding (ret1=127)',
'Original Noisy Image','Histogram',"Otsu's Thresholding",
'Gaussian filtered Image','Histogram',"Otsu's Thresholding"]
plt.figure(figsize=(0.03*img.shape[0],0.03*img.shape[1]))
for i in range(3):
    plt.subplot(3,3,i*3+1),plt.imshow(images[i*3],'gray')
    plt.title(titles[i*3]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+2),plt.hist(images[i*3].ravel(),256)
    plt.title(titles[i*3+1]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+3),plt.imshow(images[i*3+2],'gray')
    plt.title(titles[i*3+2]), plt.xticks([]), plt.yticks([])
plt.show()
ret1: 127.0
ret2: 148.0
ret3: 148.0
# CODE HERE
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(seg_images_path+ 'nature_night.jpg',0)
# global thresholding
ret1,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
# Otsu's thresholding
ret2,th2 = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# cv2.THRESH_BINARY, cv2.THRESH_BINARY_INV, cv2.THRESH_TRUNC,
# cv2.THRESH_TOZERO, cv2.THRESH_TOZERO_INV
# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(img,(3,3),0)
ret3,th3 = cv2.threshold(blur,0,255, cv2.THRESH_TRUNC+cv2.THRESH_OTSU)
print('ret1: {}'.format(ret1))
print('ret2: {}'.format(ret2))
print('ret3: {}'.format(ret3))
# plot all the images and their histograms
images = [img, 0, th1,
img, 0, th2,
blur, 0, th3]
titles = ['Original Noisy Image','Histogram','Global Thresholding (ret1=127)',
'Original Noisy Image','Histogram',"Otsu's Thresholding",
'Gaussian filtered Image','Histogram',"Otsu's Thresholding\n(The good one)"]
plt.figure(figsize=(0.03*img.shape[0],0.03*img.shape[1]))
for i in range(3):
    plt.subplot(3,3,i*3+1),plt.imshow(images[i*3],'gray')
    plt.title(titles[i*3]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+2),plt.hist(images[i*3].ravel(),256)
    plt.title(titles[i*3+1]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+3),plt.imshow(images[i*3+2],'gray')
    plt.title(titles[i*3+2]), plt.xticks([]), plt.yticks([])
plt.show()
# Explanation of the decisions made:
# Basically, I only had three parameters to test: the size of the
# kernel for the Gaussian filtering, whether or not to use
# cv2.THRESH_OTSU, and which of the other flags would be most
# convenient. The script shows the output of the three main
# configurations (leaving out the inverted flags).
# Discussion and results:
# As we can see in the three resulting images, the best configuration
# uses the flags cv2.THRESH_TRUNC+cv2.THRESH_OTSU, as it helps to keep
# only the main object (the tree) in the foreground and eliminate most
# of the surrounding effects in the background.
# A good result on this specific image.
ret1: 127.0
ret2: 56.0
ret3: 51.0
https://docs.opencv.org/4.3.0/d9/d61/tutorial_py_morphological_ops.html
Morphological transformations are simple operations that change the shapes of binary regions. They need two inputs: (1) the original image, and (2) the structuring element or kernel, which decides the nature of the operation. Four basic morphological operators are erosion, dilation, opening and closing.
The basic idea of erosion is just like soil erosion: it erodes away the boundaries of the foreground object. The kernel slides through the image (as in 2D convolution). A pixel in the original image (either 1 or 0) will be kept at 1 only if all the pixels under the kernel are 1; otherwise it is eroded (set to zero). All the pixels near a boundary will be discarded, depending upon the size of the kernel. So the thickness or size of each foreground object (white region) decreases in the image. It is useful for removing small white noise regions, detaching two connected objects, etc. Here is an example where a 5x5 kernel (matrix of ones) is used:
import cv2
import numpy as np
img = cv2.imread(seg_images_path+'hotel.jpg',0)
plt.imshow(img,'gray')
plt.title('Original')
plt.xticks([]),plt.yticks([])
plt.show()
ret1,thr = cv2.threshold(img,127,255,cv2.THRESH_BINARY_INV)
plt.imshow(thr,'gray')
plt.title('thresholded')
plt.xticks([]),plt.yticks([])
plt.show()
kernel = np.ones((5,5),np.uint8)
eroded = cv2.erode(thr,kernel,iterations = 1)
plt.imshow(eroded,'gray')
plt.title('eroded')
plt.xticks([]),plt.yticks([])
plt.show()
It is just the opposite of erosion. Here, a pixel element is ‘1’ if at least one pixel under the kernel is ‘1’. So it increases the size of each white region (foreground object) in the image. Normally, in cases like noise removal, erosion is followed by dilation: while erosion removes white noise regions, it also shrinks our object, so we dilate it afterwards. It is also useful for joining broken parts of an object.
dilated = cv2.dilate(thr,kernel,iterations = 1)
plt.imshow(dilated,'gray')
plt.title('dilated')
plt.xticks([]),plt.yticks([])
plt.show()
Opening is the name for erosion followed by dilation. It is useful for removing noise, as explained above. Here we use the function cv2.morphologyEx().
opened = cv2.morphologyEx(thr, cv2.MORPH_OPEN, kernel)
plt.imshow(opened,'gray')
plt.title('opened')
plt.xticks([]),plt.yticks([])
plt.show()
Closing is the reverse of opening: dilation followed by erosion. It is useful for closing small holes inside foreground objects, or small black points on the object.
closed = cv2.morphologyEx(thr, cv2.MORPH_CLOSE, kernel)
plt.imshow(closed,'gray')
plt.title('closed')
plt.xticks([]),plt.yticks([])
plt.show()
We manually created rectangular structuring elements in the previous examples with the help of NumPy. But in some cases, you may need elliptical/circular kernels. For this purpose, OpenCV has the function cv2.getStructuringElement(): you just pass the shape and size of the kernel, and you get the desired kernel.
# Rectangular Kernel
cv2.getStructuringElement(cv2.MORPH_RECT,(5,5))
#array([[1, 1, 1, 1, 1],
# [1, 1, 1, 1, 1],
# [1, 1, 1, 1, 1],
# [1, 1, 1, 1, 1],
# [1, 1, 1, 1, 1]], dtype=uint8)
# Elliptical Kernel
cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
#array([[0, 0, 1, 0, 0],
# [1, 1, 1, 1, 1],
# [1, 1, 1, 1, 1],
# [1, 1, 1, 1, 1],
# [0, 0, 1, 0, 0]], dtype=uint8)
# Cross-shaped Kernel
cv2.getStructuringElement(cv2.MORPH_CROSS,(5,5))
#array([[0, 0, 1, 0, 0],
# [0, 0, 1, 0, 0],
# [1, 1, 1, 1, 1],
# [0, 0, 1, 0, 0],
# [0, 0, 1, 0, 0]], dtype=uint8)
array([[0, 0, 1, 0, 0],
[0, 0, 1, 0, 0],
[1, 1, 1, 1, 1],
[0, 0, 1, 0, 0],
[0, 0, 1, 0, 0]], dtype=uint8)
Consider the image hotel.jpg; find the best parameters for thresholding and morphological operations to obtain a mask of the hotel that is as accurate as possible.
#SOLUTION HERE
import cv2
import numpy as np
img = cv2.imread(seg_images_path+'hotel.jpg',0)
plt.imshow(img,'gray')
plt.title('Original')
plt.xticks([]),plt.yticks([])
plt.show()
blur = cv2.GaussianBlur(img,(5,5),0)
# cv2.THRESH_BINARY, cv2.THRESH_BINARY_INV, cv2.THRESH_TRUNC,
# cv2.THRESH_TOZERO, cv2.THRESH_TOZERO_INV
ret1,thr = cv2.threshold(blur,0,255, cv2.THRESH_TOZERO_INV+cv2.THRESH_OTSU)
plt.imshow(thr,'gray')
plt.title('thresholded')
plt.xticks([]),plt.yticks([])
plt.show()
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(7,7))
closed = cv2.morphologyEx(thr, cv2.MORPH_CLOSE, kernel)
plt.imshow(closed,'gray')
plt.title('closed')
plt.xticks([]),plt.yticks([])
plt.show()
# Explanation of the decisions made:
# Multiple parameters had to be tuned: the dimensions of the Gaussian
# blur, the flags for the threshold and the morphological
# transformation to be used. The parameter tuning task was quite
# exhausting, and several combinations were tested.
# Discussion and results:
# In order to minimize the differences between the hotel in the
# original image and in the mask, I settled on the values mentioned in
# the above script. In particular, I got better results using the
# cross-shaped kernel than with the other kernel shapes. The resulting
# mask fits the original hotel reasonably well, although with some more
# tuning (and luck) I feel I could have gotten a better output.
https://docs.opencv.org/3.4.2/d1/d5c/tutorial_py_kmeans_opencv.html
K-Means clustering is an unsupervised machine learning algorithm that aims to partition N observations into K clusters, in which each observation belongs to the cluster with the nearest mean. A cluster refers to a collection of data points aggregated together because of certain similarities. For image segmentation, the clusters here are different image colors.
cv2.kmeans() is the function in OpenCV for data clustering:
compactness, bestLabels, centers = cv2.kmeans(samples, K, bestLabels, criteria, attempts, flags)
criteria - iteration termination criteria. When any of these criteria is satisfied, algorithm iteration stops. It should be a tuple of 3 parameters (type, max_iter, epsilon):
- cv2.TERM_CRITERIA_EPS - stop if specified accuracy, epsilon, is reached.
- cv2.TERM_CRITERIA_MAX_ITER - stop after the specified number of iterations, max_iter.
- cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER - stop when any of the above condition is met.
attempts - number of times the algorithm is executed using different initial labellings. The algorithm returns the labels that yield the best compactness; this compactness is returned as output.
import cv2
import numpy as np
import matplotlib.pyplot as plt
image = cv2.imread(seg_images_path+"landscape.png")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # convert to RGB
# reshape the image to a 1D array of pixels with 3 color values (RGB)
pixel_values = image.reshape((-1, 3))  # -1 lets NumPy infer the number of pixels
pixel_values = np.float32(pixel_values)  # convert to float
# define stopping criteria
criteria = (cv2.TERM_CRITERIA_EPS+cv2.TERM_CRITERIA_MAX_ITER,100,0.2)
k = 3 # number of clusters (K)
_, labels, (centers) = cv2.kmeans(pixel_values,k,None,criteria,10, cv2.KMEANS_RANDOM_CENTERS)
centers = np.uint8(centers)# convert back to 8 bit values
segmented_image = centers[labels.flatten()] # convert all pixels to the color of the centroids
# reshape back to the original image dimension
segmented_image = segmented_image.reshape(image.shape)
plt.figure(figsize=(0.03*image.shape[0],0.03*image.shape[1]))
plt.subplot(121)
plt.imshow(image)
plt.title('original')
plt.xticks([]),plt.yticks([])
plt.subplot(122)
plt.imshow(segmented_image)
plt.title('segmented')
plt.xticks([]),plt.yticks([])
plt.show()
https://scikit-learn.org/stable/modules/generated/sklearn.cluster.MeanShift.html
Mean shift clustering aims to discover by itself the number and location of the local maxima (blobs) of the probability density function. It is a centroid-based algorithm which works by updating candidates for centroids to be the mean of the points within a given region. These candidates are then filtered in a post-processing stage to eliminate near-duplicates and form the final set of centroids.
class sklearn.cluster.MeanShift(*, bandwidth=None, seeds=None, bin_seeding=False, min_bin_freq=1, cluster_all=True, n_jobs=None, max_iter=300)
bandwidth - float, default=None, estimated using sklearn.cluster.estimate_bandwidth().
seeds - array-like of shape (n_samples, n_features), default=None. If None, the seeds are calculated by sklearn.cluster.get_bin_seeds() with bandwidth as the grid size and default values for other parameters.
bin_seeding bool, default=False. If true, initial kernel locations are not locations of all points, but rather the location of the discretized version of points, where points are binned onto a grid whose coarseness corresponds to the bandwidth. Setting this option to True will speed up the algorithm because fewer seeds will be initialized. Ignored if seeds argument is not None.
min_bin_freq - int, default=1. To speed up the algorithm, accept only those bins with at least min_bin_freq points as seeds.
cluster_all - bool, default=True. If true, then all points are clustered, even those orphans that are not within any kernel. Orphans are assigned to the nearest kernel. If false, then orphans are given cluster label -1.
n_jobs - int, default=None. The number of jobs to use for the computation of the n_init runs in parallel.
max_iter - int, default=300. Maximum number of iterations per seed point before the clustering operation terminates (if it has not converged yet).
sklearn.cluster.estimate_bandwidth(X, *, quantile=0.3, n_samples=None, random_state=0, n_jobs=None)
Estimate the bandwidth, takes time at least quadratic in n_samples. For large datasets, it’s wise to set that parameter to a small value.
X - array-like of shape (n_samples, n_features) of input points.
quantile - float, default=0.3. Should be between [0, 1], 0.5 means that the median of all pairwise distances is used.
n_samples -int, default=None, number of samples to use. If not given, all samples are used.
random_state - int, RandomState instance, default=None. The generator used to randomly select the samples from input points for bandwidth estimation.
n_jobs - int, default=None, number of parallel jobs to run for neighbors search.
import cv2
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
import matplotlib.pyplot as plt
image = cv2.imread(seg_images_path+"landscape.png")
image=cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
#Need to convert image into feature array based
#on rgb intensities
flat_image=np.reshape(image, [-1, 3])
#Estimate bandwidth
bandwidth2 = estimate_bandwidth(flat_image, quantile=0.12, n_samples=1000)
ms = MeanShift(bandwidth=bandwidth2, bin_seeding=True)
ms.fit(flat_image)
labels=ms.labels_
# Plot image vs segmented image
plt.figure(figsize=(0.03*image.shape[0],0.03*image.shape[1]))
plt.subplot(121)
plt.imshow(image)
plt.title('original')
plt.xticks([]),plt.yticks([])
plt.subplot(122)
plt.imshow(np.reshape(labels, [image.shape[0],image.shape[1]]),cmap='jet')
plt.title('segmented')
plt.xticks([]),plt.yticks([])
plt.show()
Compare the performance of k-means and mean-shift on ‘azores.jpg’ image. Choose the most convenient parameters. You can apply morphological operations to segmentation results to eliminate small irrelevant regions. Seek the best morphological operations and segmentation parameters to get the cleanest segmentation of the sky.
#SOLUTION HERE
import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import MeanShift, estimate_bandwidth
######################### K-Means ##############################
image = cv2.imread(seg_images_path+"azores.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
pixel_values = image.reshape((-1, 3))
pixel_values = np.float32(pixel_values)
criteria = (cv2.TERM_CRITERIA_EPS+cv2.TERM_CRITERIA_MAX_ITER,
500,0.1)
k = 5
_, labels, (centers) = cv2.kmeans(pixel_values,k,None,criteria,
                                  10, cv2.KMEANS_PP_CENTERS)
centers = np.uint8(centers)
segmented_image = centers[labels.flatten()]
segmented_image = segmented_image.reshape(image.shape)
plt.figure(figsize=(0.03*image.shape[0],0.03*image.shape[1]))
plt.subplot(121)
plt.imshow(image)
plt.title('original')
plt.xticks([]),plt.yticks([])
plt.subplot(122)
plt.imshow(segmented_image)
plt.title('segmented using K-Means')
plt.xticks([]),plt.yticks([])
plt.show()
# Explanation of the decisions made:
# Among the different parameters of the OpenCV K-Means implementation,
# I focused mainly on the maximum number of iterations, the goal
# accuracy, the number of clusters and the center initialisation
# technique. The first parameter I modified was the number of clusters,
# as 3 classes are clearly not enough to capture the details in the
# sky. I therefore went with 5 classes, increased the maximum number of
# iterations and reduced the goal accuracy; the script still maintained
# a reasonable execution time. Finally, I chose to initialise the
# centers with the cv2.KMEANS_PP_CENTERS flag, as it added marginal
# improvements.
# Discussion and results:
# We get more detail in the sky, and 5 classes suit this image well!
#SOLUTION HERE
import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import MeanShift, estimate_bandwidth
######################### Mean-Shift ##############################
image = cv2.imread(seg_images_path+"azores.jpg")
image=cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
flat_image=np.reshape(image, [-1, 3])
bandwidth2 = estimate_bandwidth(flat_image, quantile=0.15,
n_samples=1000)
ms = MeanShift(bandwidth=bandwidth2, bin_seeding=True, cluster_all=False)
ms.fit(flat_image)
labels=ms.labels_
plt.figure(figsize=(0.03*image.shape[0],0.03*image.shape[1]))
plt.subplot(121)
plt.imshow(image)
plt.title('original')
plt.xticks([]),plt.yticks([])
plt.subplot(122)
plt.imshow(np.reshape(labels, [image.shape[0],image.shape[1]]),cmap='jet')
plt.title('segmented using Mean-Shift')
plt.xticks([]),plt.yticks([])
plt.show()
# Explanation of the decisions made:
# For the Mean-Shift algorithm, I primarily tested the parameters
# quantile, bin_seeding and cluster_all. For bin_seeding, changing its
# value to False had a big negative influence on the performance of the
# algorithm, as it took too much time to process a single image, which
# is logical given the definition of the parameter. However, setting
# the cluster_all parameter to False instantly gave the sky more
# detail, as the algorithm treated those details in the sky as orphans
# and gave them the cluster label -1, as the definition of the
# parameter states. Finally, I tested several values for the quantile
# parameter before settling on 0.15, which appeared to give the best
# results.
# Discussion and results:
# Even though tweaking this algorithm also brought out more detail in
# the sky, I feel that the K-Means algorithm is less aggressive in
# assigning pixels to the different labels, as we can clearly see that
# Mean-Shift misclassified more pixels than K-Means. But again, with
# more parameter tuning certainly comes better performance!
import numpy as np
import cv2
from matplotlib import pyplot as plt
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def getX(self):
        return self.x
    def getY(self):
        return self.y

# get the difference of gray value in image
def getGrayDiff(img, currentPoint, tmpPoint):
    return abs(int(img[currentPoint.x, currentPoint.y]) - int(img[tmpPoint.x, tmpPoint.y]))

# get four or eight neighbors
def selectConnects(neighbor_num):
    if neighbor_num == 8: # eight neighbors
        connects = [Point(-1, -1), Point(0, -1), Point(1, -1), Point(1, 0), Point(1, 1),
                    Point(0, 1), Point(-1, 1), Point(-1, 0)]
    elif neighbor_num == 4: # four neighbors
        connects = [Point(0, -1), Point(1, 0), Point(0, 1), Point(-1, 0)]
    else:
        raise ValueError("The neighbor_num should be 4 or 8")
    return connects

def regionGrow(img, mask, seed, thresh, neighbor_num=8, label=1):
    """ single seed region grow algorithm """
    height, width = img.shape
    connects = selectConnects(neighbor_num)
    seedList = []
    seedList.append(seed)
    while (len(seedList) > 0):
        currentPoint = seedList.pop(0)
        mask[currentPoint.x, currentPoint.y] = label
        for i in range(neighbor_num):
            tmpX = currentPoint.x + connects[i].x
            tmpY = currentPoint.y + connects[i].y
            if tmpX < 0 or tmpY < 0 or tmpX >= height or tmpY >= width or mask[tmpX, tmpY] > 0:
                continue
            grayDiff = getGrayDiff(img, currentPoint, Point(tmpX, tmpY))
            if grayDiff < thresh and mask[tmpX, tmpY] < 1:
                mask[tmpX, tmpY] = label
                seedList.append(Point(tmpX, tmpY))
    return mask

def find_undetermined(mask):
    # return the first pixel that has not been assigned a label yet
    zero_pos = np.where(mask < 1)
    if len(zero_pos[0]) == 0:
        return None
    x = zero_pos[0][0]
    y = zero_pos[1][0]
    return Point(x, y)

def img_region_grow(img, label_in):
    """ gray image region grow algorithm, different regions get different labels """
    mask = np.zeros(img.shape)
    thresh = 7
    label = label_in
    while True:
        seed = find_undetermined(mask)
        if seed is not None:
            mask = regionGrow(img, mask, seed, thresh, neighbor_num=8, label=label)
            # cv2.imshow("mask", mask)
            # plt.imshow(mask,cmap = 'gray')
            # plt.show()
            label += 10
        else:
            print("Process Done!")
            break
    plt.imshow(mask, cmap='gray')
    plt.xticks([]),plt.yticks([])
    plt.show()
image = cv2.imread(seg_images_path+'noisy.png',0)
img_region_grow(image, 50)
Process Done!
https://docs.opencv.org/4.2.0/db/d27/tutorial_py_table_of_contents_feature2d.html
https://docs.opencv.org/4.2.0/dc/d0d/tutorial_py_features_harris.html
OpenCV has the following function for this purpose:
dst = cv2.cornerHarris(src, blockSize, ksize, k)
(blockSize is the neighbourhood size, ksize the Sobel aperture, and k the Harris detector free parameter.)
Exercise: Play with the parameters to see their effect on the detection.
import cv2
import numpy as np
filename = feature_images_path+'bricks.jpg'
img = cv2.imread(filename)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
dst = cv2.cornerHarris(gray,3,3,0.2)
# result is dilated for easier visualization, not important
dst = cv2.dilate(dst,None)
# Threshold for an optimal value, it may vary depending on the image.
corners=img.copy()
corners[dst>0.001*dst.max()]=[0,0,255]
plt.imshow(corners)
plt.xticks([]),plt.yticks([])
plt.show()
Using cv2.threshold() in combination with the function cv2.connectedComponentsWithStats() you can retrieve the location of the most conspicuous Harris points.
cv2.connectedComponentsWithStats() returns the following information:
The bounding box of the connected component
The area (in pixels) of the component
The centroid/center (x, y)-coordinates of the component
ret, dst = cv2.threshold(dst,0.01*dst.max(),255,0)
dst = np.uint8(dst)
# find centroids
ret, labels, stats, centroids = cv2.connectedComponentsWithStats(dst)
points=centroids.astype(int)
corner=img.copy()
corner[dst>0.001*dst.max()]=[0,0,255]
plt.imshow(corner)
plt.xticks([]),plt.yticks([])
plt.show()
print (points)
[[135 106] [118 2] [ 63 4] [262 5] [ 5 5] [252 6] [ 4 11] [ 51 13] [ 62 13] [265 14] [106 15] [116 15] [121 16] [250 19] [106 22] [121 22] [253 28] [ 8 36] [138 33] [ 28 37] [ 60 37] [ 95 34] [263 33] [ 52 38] [121 37] [144 36] [ 77 39] [101 37] [ 98 42] [134 42] [264 41] [ 32 43] [ 68 44] [250 43] [259 47] [ 71 53] [248 55] [257 54] [ 86 55] [264 59] [ 90 64] [110 66] [250 69] [263 68] [ 26 70] [ 89 69] [122 70] [142 70] [120 77] [147 78] [218 78] [249 80] [161 81] [156 85] [ 7 89] [161 91] [177 94] [214 94] [247 95] [252 95] [181 101] [214 102] [ 28 103] [ 61 103] [174 104] [250 105] [ 3 108] [199 114] [ 45 116] [156 116] [209 115] [251 118] [ 27 120] [206 120] [262 122] [258 125] [209 127] [224 129] [ 27 129] [ 3 134] [248 133] [226 135] [248 141] [168 144] [242 146] [ 24 149] [262 150] [ 63 154] [ 42 158] [ 15 161] [ 58 165] [243 170] [205 188] [215 186] [ 41 209]]
OpenCV provides a convenient way to detect blobs and filter them based on different characteristics. The class SimpleBlobDetector implements a simple algorithm for extracting blobs from an image:
This class performs several filtrations of the returned blobs; you set the corresponding filterBy* flag to true/false to turn each filtration on or off. Available filtrations: by color, by area, by circularity, by inertia ratio and by convexity.
Default parameter values are tuned to extract dark circular blobs.
import cv2
import numpy as np;
from matplotlib import pyplot as plt
im = cv2.imread(feature_images_path+"corunha.jpg", cv2.IMREAD_GRAYSCALE)
# Set up the detector with default parameters.
detector = cv2.SimpleBlobDetector_create()
# Set up the detector with specific parameters.
params = cv2.SimpleBlobDetector_Params()
params.filterByColor=True #Filter by Color
params.blobColor=255
params.filterByArea = True # Filter by Area.
params.minArea = 20
params.maxArea = 4000
params.filterByCircularity = False # Filter by Circularity
#params.minCircularity = 0.5
params.filterByConvexity = False # Filter by Convexity
#params.minConvexity = 0.87
params.filterByInertia = False # Filter by Inertia
#params.minInertiaRatio = 0.8
params.minDistBetweenBlobs = 20 # Distance Between Blobs
# Set up the detector with specific parameters.
detector = cv2.SimpleBlobDetector_create(params)
keypoints = detector.detect(im) # Detect blobs.
# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures the size of the circle corresponds to the size of blob
im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]), (0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
plt.imshow(im_with_keypoints), plt.title("Keypoints")
plt.xticks([]),plt.yticks([])
plt.show()
Use the blob detector to detect the ball in black-trap-rock.jpg image. Choose the most convenient parameters.
#SOLUTION HERE
import cv2
import numpy as np;
from matplotlib import pyplot as plt
im = cv2.imread(seg_images_path+"black-trap-rock.jpg", cv2.IMREAD_GRAYSCALE)
detector = cv2.SimpleBlobDetector_create()
params = cv2.SimpleBlobDetector_Params()
params.filterByColor=True
params.blobColor=255
params.filterByArea = True
params.minArea = 500
params.maxArea = 1000
params.filterByCircularity = False
#params.minCircularity = 0.9
params.filterByConvexity = True
params.minConvexity = 0.85
params.filterByInertia = False
#params.minInertiaRatio = 0.8
params.minDistBetweenBlobs = 20
detector = cv2.SimpleBlobDetector_create(params)
keypoints = detector.detect(im)
im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]), (0,0,255),
cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
plt.imshow(im_with_keypoints), plt.title("Keypoints")
plt.xticks([]),plt.yticks([])
plt.show()
# Explanation of the decisions made:
# At the start, I used the same parameters as in the first Blob
# Detection example. However, too many circles were detected, so I had
# to increase the minimum area and decrease the maximum area in the
# parameters to eliminate circles of extreme sizes. I also noticed that
# the detection improved by turning on the convexity filter and setting
# its parameter to around 0.85.
# Discussion and results:
# As a result of the parameter tuning, the script detects the white
# golf ball successfully, without being influenced by the other white
# parts of the image.
https://docs.opencv.org/4.2.0/d1/d89/tutorial_py_orb.html https://docs.opencv.org/4.x/db/d95/classcv_1_1ORB.html
This algorithm was proposed as an efficient alternative to SIFT and SURF in computation cost and matching performance. We have to create an ORB object with the function cv2.ORB_create(), which has a number of optional parameters:
retval = cv2.ORB_create([, nfeatures[, scaleFactor[, nlevels[, edgeThreshold[, firstLevel[, WTA_K[, scoreType[, patchSize[, fastThreshold]]]]]]]]])
Below is a simple code which shows the use of ORB.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread(seg_images_path+'hotel.jpg',0)
# Initiate detector
orb = cv2.ORB_create(nfeatures=5000)
# find the keypoints with ORB
kp = orb.detect(img,None)
# compute the descriptors with ORB
kp, des = orb.compute(img, kp)
# draw only keypoints location, not size and orientation
img2 = cv2.drawKeypoints(img, kp, None,color=(0,255,0), flags=0)
plt.imshow(img2)
plt.xticks([]),plt.yticks([])
plt.show()
The Brute-Force matcher takes the descriptor of one feature in the first set and compares it with all the features in the second set using some distance calculation. The closest match is returned.
First we have to create the BFMatcher object using cv2.BFMatcher(). It takes two optional parameters:
retval=cv.BFMatcher_create([, normType[, crossCheck]])
normType - distance measurement to be used. By default, it is cv2.NORM_L2, but for binary string based descriptors like ORB, cv2.NORM_HAMMING should be used. If ORB uses WTA_K == 3 or 4, cv2.NORM_HAMMING2 should be used.
crossCheck - boolean, false by default. If true, the matcher returns only those matches with value (i,j) such that i-th descriptor in set A has j-th descriptor in set B as the best match and vice-versa.
Two important methods of this class are BFMatcher.match() and BFMatcher.knnMatch(). The first returns the single best match; the second returns the k best matches, where k is specified by the user.
Just as cv2.drawKeypoints() draws keypoints, cv2.drawMatches() draws matches: it stacks the two images horizontally and draws lines from the first image to the second showing the best matches.
cv2.drawMatchesKnn() draws all the k best matches. If k=2, it will draw two match-lines for each keypoint, so we have to pass a mask if we want to draw matches selectively, as in the sketch below.
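As a sketch of how knnMatch() is typically combined with a ratio test (the 0.75 threshold follows Lowe's paper and is a conventional choice rather than an OpenCV requirement; the image pair is borrowed from the example that follows):
import cv2
from matplotlib import pyplot as plt
img1 = cv2.imread(feature_images_path+'houses1.jpg',0) # query image
img2 = cv2.imread(feature_images_path+'houses2.jpg',0) # test image
orb = cv2.ORB_create(nfeatures=5000)
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)
# crossCheck must stay False (its default) when asking for k > 1 matches
bf = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = bf.knnMatch(des1, des2, k=2)
# Ratio test: keep a match only when it clearly beats the runner-up
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append([pair[0]])   # drawMatchesKnn expects a list of lists
img3 = cv2.drawMatchesKnn(img1, kp1, img2, kp2, good, None, flags=2)
plt.imshow(img3)
plt.axis('off')
plt.show()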
Here, we will see a simple example on how to match features between two images. In this case, we have a query image and a test image. We will try to find the former in the latter using feature matching.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img1 = cv2.imread(feature_images_path+'houses1.jpg',0) # query image
img2 = cv2.imread(feature_images_path+'houses2.jpg',0) # test image
# Initiate detector
orb = cv2.ORB_create(nfeatures=5000)
# find the keypoints and descriptors
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)
# create BFMatcher object
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
# Match descriptors.
matches = bf.match(des1,des2)
# Sort them in ascending order of their distance.
matches = sorted(matches, key = lambda x:x.distance)
# Draw first 100 matches.
img3 = cv2.drawMatches(img1,kp1,img2,kp2,matches[:100], None,flags=2)
plt.imshow(img3)
plt.axis('off')
plt.show()
Image alignment (or registration) is the technique of warping one image (or sometimes both images) so that the features in the two images line up perfectly.
In some medical applications, multiple scans of a tissue may be taken at slightly different times, and the images are registered to follow its evolution over time. Perhaps the most interesting application of image alignment, though, is creating panoramas.
In general, 3D alignment requires depth information. However, when the two images are taken by rotating the camera about its optical center (as in the case of panoramas), we can use the code below to align two views of the scene.
The function cv2.warpPerspective() applies a perspective transformation to an image:
dst = cv2.warpPerspective(src, h, dsize)
where h is the 3x3 transformation matrix and dsize is the size (width, height) of the output image.
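As a minimal sketch of the function on its own (the corner coordinates below are invented purely for illustration), we can build the 3x3 matrix with cv2.getPerspectiveTransform() from four point correspondences and warp the image with it:
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(feature_images_path+'houses1.jpg')
rows, cols = img.shape[:2]
# Map the four image corners onto an arbitrary convex quadrilateral
src = np.float32([[0,0], [cols-1,0], [cols-1,rows-1], [0,rows-1]])
dst = np.float32([[40,40], [cols-80,20], [cols-40,rows-40], [20,rows-80]])
h = cv2.getPerspectiveTransform(src, dst)   # 3x3 perspective matrix
warped = cv2.warpPerspective(img, h, (cols, rows))
plt.imshow(cv2.cvtColor(warped, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()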
Here we present an example with two views of a scene, looking for the perspective transformation between them:
import cv2
import numpy as np
from matplotlib import pyplot as plt
MAX_FEATURES = 500
GOOD_MATCH_PERCENT = 0.15
def alignImages(im1, im2):
    # Convert images to grayscale
    im1Gray = cv2.cvtColor(im1, cv2.COLOR_RGB2GRAY)
    im2Gray = cv2.cvtColor(im2, cv2.COLOR_RGB2GRAY)
    # Detect ORB features and compute descriptors
    orb = cv2.ORB_create(nfeatures=MAX_FEATURES)
    kpts1, descriptors1 = orb.detectAndCompute(im1Gray, None)
    kpts2, descriptors2 = orb.detectAndCompute(im2Gray, None)
    # Create BFMatcher object and match features
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(descriptors1, descriptors2)
    # Sort matches by score (Hamming distance)
    matches = sorted(matches, key=lambda x: x.distance)
    # Remove the not-so-good matches
    numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
    matches = matches[:numGoodMatches]
    # Draw top matches
    imMatches = cv2.drawMatches(im1Gray, kpts1, im2Gray, kpts2, matches, None, flags=2)
    plt.imshow(imMatches), plt.title('Matched Features'), plt.show()
    # cv2.imshow('Matched Features', imMatches)
    # cv2.waitKey(0)
    # cv2.destroyWindow('Matched Features')
    #
    # cv2.imwrite("matches.jpg", imMatches)
    # Extract locations of good matches
    points1 = np.zeros((len(matches), 2), dtype=np.float32)
    points2 = np.zeros((len(matches), 2), dtype=np.float32)
    for i, match in enumerate(matches):
        points1[i, :] = kpts1[match.queryIdx].pt
        points2[i, :] = kpts2[match.trainIdx].pt
    # Find homography robustly with RANSAC
    h, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
    # Use the homography to align im1 with im2
    height1, width1, _ = im1.shape
    height2, width2, _ = im2.shape
    # Aligned image im1Reg, warped with the 3x3 matrix onto a canvas wide enough for both views
    im1Reg = cv2.warpPerspective(im1, h, (width1 + width2, height1))
    return im1Reg, h
if __name__ == '__main__':
    # Read reference image
    refFilename = feature_images_path+'houses2.jpg'
    print("Reading reference image : ", refFilename)
    imReference = cv2.imread(refFilename, cv2.IMREAD_COLOR)
    imReference = cv2.cvtColor(imReference, cv2.COLOR_BGR2RGB)
    # Read image to be aligned
    imFilename = feature_images_path+'houses1.jpg'
    print("Reading image to align : ", imFilename)
    im = cv2.imread(imFilename, cv2.IMREAD_COLOR)
    im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
    plt.subplot(1,2,1)
    plt.imshow(imReference), plt.title("Reference image")
    plt.axis('off')
    plt.subplot(1,2,2)
    plt.imshow(im), plt.title("Second image")
    plt.axis('off')
    plt.show()
    print("Aligning images ...")
    # Registered image will be stored in imReg,
    # the estimated homography in h.
    imReg, h = alignImages(im, imReference)
    # Write aligned image to disk.
    #outFilename = "aligned.jpg"
    #print("Saving aligned image : ", outFilename)
    #cv2.imwrite(outFilename, imReg)
    plt.imshow(imReg), plt.title("Aligned image")
    plt.axis('off')
    plt.show()
    # Print estimated homography
    print("Estimated homography : \n", h)
Reading reference image :  ./Images/Feature_images/houses2.jpg
Reading image to align :  ./Images/Feature_images/houses1.jpg
Aligning images ...
Estimated homography :
[[ 7.02434358e-01  2.83631214e-02  2.57990106e+02]
 [-1.19294145e-01  9.40458461e-01  1.01918471e+01]
 [-6.57807313e-04  3.92848527e-05  1.00000000e+00]]
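One way to sanity-check such a matrix is to apply it to individual pixel coordinates with cv2.perspectiveTransform(); a small sketch, assuming h is the homography returned by alignImages() above (the point (100, 200) is arbitrary):
import numpy as np
import cv2
pt = np.float32([[[100, 200]]])          # shape (1, 1, 2), as the API expects
mapped = cv2.perspectiveTransform(pt, h)
print("(100, 200) in the image to align maps to", mapped[0, 0], "in the reference image")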
Now, find the perspective transformation between view4.jpg and view5.jpg, transform the second one, and then stitch the two together to create a panorama. It can be done just by modifying the last part of the function alignImages():
height1, width1, _ = im1.shape
height2, width2, _ = im2.shape
im1Reg = cv2.warpPerspective(im1, h, (width1 + width2, height1))
im1Reg[0:height2, 0:width2] = im2   # paste the reference view onto the left part of the canvas
Choose the most convenient parameters.
#SOLUTION HERE
import cv2
import numpy as np
from matplotlib import pyplot as plt
MAX_FEATURES = 500
GOOD_MATCH_PERCENT = 0.15
def alignImages(im1, im2):
    # Detect ORB features and compute descriptors on grayscale versions
    im1Gray = cv2.cvtColor(im1, cv2.COLOR_RGB2GRAY)
    im2Gray = cv2.cvtColor(im2, cv2.COLOR_RGB2GRAY)
    orb = cv2.ORB_create(nfeatures=MAX_FEATURES)
    kpts1, descriptors1 = orb.detectAndCompute(im1Gray, None)
    kpts2, descriptors2 = orb.detectAndCompute(im2Gray, None)
    # Brute-force matching, keeping only the best fraction of matches
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(descriptors1, descriptors2)
    matches = sorted(matches, key=lambda x: x.distance)
    numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
    matches = matches[:numGoodMatches]
    imMatches = cv2.drawMatches(im1Gray, kpts1, im2Gray, kpts2, matches, None, flags=2)
    plt.imshow(imMatches), plt.title('Matched Features'), plt.show()
    # Extract locations of good matches and estimate the homography
    points1 = np.zeros((len(matches), 2), dtype=np.float32)
    points2 = np.zeros((len(matches), 2), dtype=np.float32)
    for i, match in enumerate(matches):
        points1[i, :] = kpts1[match.queryIdx].pt
        points2[i, :] = kpts2[match.trainIdx].pt
    h, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
    # Warp im1 onto a canvas wide enough for both views, then paste im2 on the left
    height1, width1, _ = im1.shape
    height2, width2, _ = im2.shape
    im1Reg = cv2.warpPerspective(im1, h, (width1 + width2, height1))
    im1Reg[0:height2, 0:width2] = im2
    return im1Reg, h
if __name__ == '__main__':
    refFilename = feature_images_path+'view4.jpg'
    print("Reading reference image : ", refFilename)
    imReference = cv2.imread(refFilename, cv2.IMREAD_COLOR)
    imReference = cv2.cvtColor(imReference, cv2.COLOR_BGR2RGB)
    imFilename = feature_images_path+'view5.jpg'
    print("Reading image to align : ", imFilename)
    im = cv2.imread(imFilename, cv2.IMREAD_COLOR)
    im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
    plt.subplot(1,2,1)
    plt.imshow(imReference), plt.title("Reference image")
    plt.axis('off')
    plt.subplot(1,2,2)
    plt.imshow(im), plt.title("Second image")
    plt.axis('off')
    plt.show()
    print("Aligning images ...")
    imReg, h = alignImages(im, imReference)
    plt.imshow(imReg), plt.title("Aligned image")
    plt.axis('off')
    plt.show()
    print("Estimated homography : \n", h)
# Explanation of the decisions made:
# Compared to the first exercise, I have to say that at first I thought
# the code would be complicated to understand, but it turned out to be
# similar: instead of manually choosing identical points between the two
# views, we use ORB features together with a brute-force matcher between
# the two images, which gives a more automated and more convenient
# script, with a more precise result as well.
# Discussion and results:
# As we can see in the final image, I was actually able to create a
# panorama from just two images! Taking into account how convenient and
# intuitive that was, I have to say this is an impressive performance
# from OpenCV!
# Comment on the whole course:
# Personally, I learned a lot about the different algorithms and
# techniques used in the image analysis field, and I can say I am now
# more familiar with OpenCV (although definitely not an expert yet, but
# that will come with time and hard work :D).
# Thank you!
Reading reference image :  ./Images/Feature_images/view4.jpg
Reading image to align :  ./Images/Feature_images/view5.jpg
Aligning images ...
Estimated homography :
[[ 7.54903457e-01  2.76855782e-01  2.38124397e+02]
 [-1.15573590e-01  1.02219726e+00  3.26526278e+01]
 [-3.33869286e-04  3.43080830e-04  1.00000000e+00]]